Optimized IsFileValid #65

Merged
takahiro0327 merged 6 commits from huge_scene2 into master 2024-05-25 06:54:25 +00:00
takahiro0327 commented 2024-05-23 11:06:53 +00:00 (Migrated from github.com)

Fixed IsFileValid being slow for large files over 200 MB. Please review and merge.

400 ms became about 80 ms.
Converted the token search algorithm to Boyer-Moore and moved it onto a thread pool.
When using NVMe, the major cost was the token lookup on the CPU.

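For context on the speedup: Boyer-Moore-style search skips ahead by up to the pattern length on a mismatch instead of advancing one byte at a time, which cuts the CPU cost of the token scan. A minimal Boyer-Moore-Horspool sketch (illustrative only; the project is C# and the token string here is made up, but the idea carries over directly):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TokenSearch {
    // Boyer-Moore-Horspool: for each byte value, precompute how far the
    // pattern may shift when that byte appears under the pattern's last slot.
    public static int indexOf(byte[] haystack, byte[] needle) {
        if (needle.length == 0) return 0;
        int[] shift = new int[256];
        Arrays.fill(shift, needle.length);
        for (int i = 0; i < needle.length - 1; i++)
            shift[needle[i] & 0xFF] = needle.length - 1 - i;

        int pos = 0;
        while (pos <= haystack.length - needle.length) {
            int j = needle.length - 1;
            while (j >= 0 && haystack[pos + j] == needle[j]) j--;
            if (j < 0) return pos; // full match
            // Skip by the bad-character shift of the byte under the last slot.
            pos += shift[haystack[pos + needle.length - 1] & 0xFF];
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = "....lots of scene data....EndToken".getBytes(StandardCharsets.UTF_8);
        byte[] token = "EndToken".getBytes(StandardCharsets.UTF_8); // hypothetical token
        System.out.println(indexOf(data, token));
    }
}
```

Against a naive scan's worst case of comparing at every offset, Horspool typically skips close to the pattern length per step, which is why the CPU-side lookup stops dominating once the reads themselves are NVMe-fast.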
ManlyMarco (Migrated from github.com) reviewed 2024-05-23 15:24:04 +00:00
ManlyMarco (Migrated from github.com) left a comment

It looks like for very large scenes there may be hundreds of threads created, which will be slower and could cause issues by filling up the thread pool.

There's also a possible race condition where the last chunk completes before a previous chunk containing the tag completes, causing a false negative.

Maybe it would be possible to use [RunParallel](https://github.com/BepInEx/BepInEx/blob/v5.4.23.1/BepInEx/ThreadingHelper.cs#L217)? It would avoid the race condition and only use as many threads as necessary.

takahiro0327 commented 2024-05-23 23:17:18 +00:00 (Migrated from github.com)

Thanks for the review.

> It looks like for very large scenes there may be hundreds of threads created, which will be slower and could cause issues by filling up the thread pool.

The PoolAllocator is now restricted by setting an upper limit on the number of instances.

> There's also a possible race condition where the last chunk completes before a previous chunk containing the tag completes, causing a false negative.

Sorry, my mistake. Fixed.

> Maybe it would be possible to use RunParallel? It would avoid the race condition and only use as many threads as necessary.

You mean building a list of the file offsets to be read and processing it in parallel, right?

In many cases it is faster to do nothing but sequential reads. This is especially true for HDDs with physically moving parts and for slow SSDs.
Calling fs.Read() sequentially on the main thread should give better speed in most environments.
It is not the fastest in certain environments, but considering the penalty in other environments, I think the current form is better.

Incidentally, in my environment it appears that a maximum of 5 pools are used for a 200 MB scene.
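The shape described above, with sequential reads on the main thread feeding a small bounded set of workers that do the searching, chunks overlapping by token-length-minus-one bytes so a token straddling a boundary is not missed, and the result reported only after every chunk finishes, can be sketched roughly like this (Java for illustration; all names are hypothetical and this is not the PR's actual code):

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ChunkedScan {
    // Naive byte search for brevity; the PR uses Boyer-Moore, but any matcher fits.
    static int indexOf(byte[] h, byte[] p) {
        outer:
        for (int i = 0; i <= h.length - p.length; i++) {
            for (int j = 0; j < p.length; j++)
                if (h[i + j] != p[j]) continue outer;
            return i;
        }
        return -1;
    }

    // Main thread reads sequentially (fast even on HDDs); a bounded pool
    // searches each chunk, mirroring the capped PoolAllocator idea.
    public static boolean containsToken(InputStream in, byte[] token, int chunkSize)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            byte[] carry = new byte[0]; // overlap so boundary-straddling tokens are found
            byte[] buf = new byte[chunkSize];
            int n;
            while ((n = in.read(buf)) > 0) {
                byte[] chunk = new byte[carry.length + n];
                System.arraycopy(carry, 0, chunk, 0, carry.length);
                System.arraycopy(buf, 0, chunk, carry.length, n);
                results.add(pool.submit(() -> indexOf(chunk, token) >= 0));
                int keep = Math.min(token.length - 1, chunk.length);
                carry = Arrays.copyOfRange(chunk, chunk.length - keep, chunk.length);
            }
            boolean found = false;
            // Wait for every chunk; returning after only the last chunk
            // finishes is exactly the false-negative race flagged earlier.
            for (Future<Boolean> f : results) found |= f.get();
            return found;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "aaaaXTOKENYbbbb".getBytes();
        // Token straddles the 4-byte chunk boundary and is still found.
        System.out.println(containsToken(new ByteArrayInputStream(data), "XTOKENY".getBytes(), 4)); // prints: true
    }
}
```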

ManlyMarco (Migrated from github.com) approved these changes 2024-05-24 18:05:02 +00:00
ManlyMarco (Migrated from github.com) left a comment

Makes sense. Thanks for the fixes, everything seems to be good now.

@@ -41,6 +43,65 @@ namespace IllusionFixes
return IsFileValid(path);
ManlyMarco (Migrated from github.com) commented 2024-05-24 18:01:33 +00:00

Maybe a timeout would be a good idea just in case? I think there's no ill effect in normal conditions.

```suggestion
                        Monitor.Wait(_queue, 2000);
```
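The suggested `Monitor.Wait(_queue, 2000)` bounds how long a consumer blocks if the producer never signals. The same guard in Java's monitor API looks like this (illustrative sketch, not the PR's code; `Object.wait` can wake spuriously, so the timed wait belongs inside a condition loop, just as `Monitor.Wait` should in C#):

```java
import java.util.ArrayDeque;

public class BoundedWaitQueue<T> {
    private final ArrayDeque<T> queue = new ArrayDeque<>();

    public synchronized void put(T item) {
        queue.addLast(item);
        notifyAll(); // wake any consumer blocked in take()
    }

    // Returns null after timeoutMs instead of blocking forever if the
    // producer side dies, which is the "just in case" the review suggests.
    public synchronized T take(long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (queue.isEmpty()) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) return null; // timed out
            wait(remaining); // may wake spuriously; loop re-checks the condition
        }
        return queue.pollFirst();
    }
}
```

The deadline recomputation matters: after a spurious wakeup the loop waits only for the time remaining, so the total block never exceeds the requested timeout.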
takahiro0327 (Migrated from github.com) reviewed 2024-05-24 22:56:38 +00:00
@@ -41,6 +43,65 @@ namespace IllusionFixes
return IsFileValid(path);
takahiro0327 (Migrated from github.com) commented 2024-05-24 22:56:38 +00:00

I added it just in case.
