Add neural speed example #135
Conversation
|
Hello, I am a code review bot on flows.network. Here are my reviews of code commits in this PR.

Overall Summary:

Potential Issues and Errors:

Most Important Findings:

Details

Commit 612c2c396653f0911f3ded717016627f41a9b51a

Key Changes:

Potential Problems:

Overall, the patch introduces new functionality efficiently but could be improved in terms of error handling and documentation.

Commit 14cb1941e1cde1feecdd7b70f5bd4cca6503e125

Key Changes:

Potential Problems:

Overall, it's important to review the impact of replacing 'context.fini_single()' with 'graph.unload()' to ensure that it aligns with the project's design and functionality. The changes should also be properly documented for better understanding by other contributors.

Commit 75d677266bbf80900b0038f65de3261908334cca

Key Changes:

Potential Problems:
|
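For context on the `context.fini_single()` → `graph.unload()` change the bot flags above, here is a minimal sketch of where such a cleanup call would sit in a WASI-NN style inference flow, written in the style of the existing Rust examples. This is not the PR's actual code: the `GraphEncoding::Ggml` placeholder, the "default" model name, the buffer sizes, and the exact signature of `Graph::unload()` are all illustrative assumptions.

```rust
// A minimal, hypothetical sketch of the inference flow, NOT the PR's code.
// Assumptions: the `wasmedge_wasi_nn` crate, a model preloaded under the
// name "default", GraphEncoding::Ggml as a stand-in for the neural speed
// backend, and `Graph::unload()` taking the place of the removed
// `context.fini_single()` cleanup call.
use wasmedge_wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding, TensorType};

fn main() {
    // Load the preloaded model and create an execution context.
    let graph = GraphBuilder::new(GraphEncoding::Ggml, ExecutionTarget::CPU)
        .build_from_cache("default")
        .expect("failed to load graph");
    let mut context = graph
        .init_execution_context()
        .expect("failed to init execution context");

    // Feed the prompt as a byte tensor and run one inference pass.
    let prompt = b"Once upon a time".to_vec();
    context
        .set_input(0, TensorType::U8, &[1], &prompt)
        .expect("failed to set input");
    context.compute().expect("failed to compute");

    // Read the generated output back into a fixed-size buffer.
    let mut output = vec![0u8; 4096];
    let n = context.get_output(0, &mut output).expect("failed to get output");
    println!("{}", String::from_utf8_lossy(&output[..n]));

    // Cleanup: the PR swaps `context.fini_single()` for `graph.unload()`;
    // the exact signature of `unload()` is assumed here.
    let _ = graph.unload();
}
```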
|
Would love to see a performance comparison of the same model with llama.cpp on Intel CPUs. |
|
I ran a simple test on an i7-12700K.
|
|
Does that mean it is actually slower than llama.cpp? |
|
Yes, the current result shows that Neural Speed is slower than llama.cpp. In addition, running Neural Speed directly also cannot match llama.cpp's runtime performance. |
|
@grorge123 |
|
Neural Speed uses all the CPU cores (20). I updated the Neural Speed version to 1.0 and tested on another computer with an i7-10700; the other variables are the same.
In this case, Neural Speed performs better than llama.cpp, but I have no idea why the i7-10700 shows better performance than the i7-12700K. |
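Since the two runtimes may default to very different thread counts (Neural Speed saturating all 20 cores here, while llama.cpp is bounded by its `-t` setting), it helps to record the available core count and measure throughput the same way in both runs. Below is a small, generic Rust timing sketch under that assumption; the `step` closure would wrap whatever per-token call the example uses (e.g. `compute()` or `compute_single()`), and nothing here is from the PR itself.

```rust
// Hypothetical benchmarking helper, not part of the PR.
use std::time::Instant;

/// Time `n_tokens` invocations of a per-token generation step and return
/// tokens per second. `step` is kept generic so this sketch does not depend
/// on the exact wasmedge_wasi_nn types; in practice it would wrap something
/// like `context.compute()`.
fn tokens_per_second<E>(
    n_tokens: usize,
    mut step: impl FnMut() -> Result<(), E>,
) -> Result<f64, E> {
    // Log how many logical cores the host exposes, since Neural Speed and
    // llama.cpp may use different numbers of threads by default.
    if let Ok(cores) = std::thread::available_parallelism() {
        eprintln!("logical cores available: {}", cores);
    }

    let start = Instant::now();
    for _ in 0..n_tokens {
        step()?;
    }
    Ok(n_tokens as f64 / start.elapsed().as_secs_f64())
}
```

Reporting both the core count and tokens/sec for each runtime would make the i7-10700 vs. i7-12700K discrepancy easier to interpret.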