

https://unsloth.ai/docs/models/qwen3.6#mtp-guide
Unsloth made a guide and has graphs with comparisons
I’m just here for the moral superiority.🌱
Mainly interested in FOSS
Currently in uni and working part-time as a developer and system administrator.
PC Specs
CPU: 7800X3D
GPU: 7900XTX
Memory: 64GB
System: Arch




You can also contribute to OpenStreetMap in your area using simple apps like StreetComplete or EveryDoor. This has a way lower barrier to entry than contributing code in my opinion. And it has the immediate benefit of a better local map for a LOT of services that are built on top of OSM.
I’m really not fond of the profiling by automated means, but it seems like an inevitable consequence of the design of the threadiverse. Everything is public and easily accessible by anyone that would like to profile you.
I certainly disapprove of moderation based on ideology. Moderation should be based on the quality of the content and whether it fits the publicly readable rules. Definitely not on some hidden analytics or on whether the user completely fits the moderator's in-group.
I will admit that this might be a good way to find and filter out LLM-based bots that are only there to promote something or manipulate the conversation. But it should still be done according to public rules.


Is this post written by an LLM?


I’m no expert, but basically it’s the way to unlock the higher/full bandwidth of HDMI 2.1. This allows higher refresh rates, resolutions, and bit depths, plus HDR. Right now you need to make sacrifices in at least one category with HDMI.


What is the difference between this implementation and the reverse engineered patches that were published a few months ago by Michał Kopeć and Tomasz Pakuła?
Edit: apparently it’s not the same patch, but Tomasz was CC’ed in the patch set so the timing might not be accidental.


I’m European and had to do the same, so it’s based on something else.


Don’t know about Ubuntu specifically, but for all software I actually want to work, I wait for the first point release after a major release.
Artificial Analysis just posted their results, and there seems to be an increase in output token usage similar to that of the 35B model.



Ah, I don’t know anything about Windows. I’m using Linux and both the latest ROCm (7.2.2) and latest Vulkan (26.0.5) packages work without issues for combined gaming and AI. For reference, my reported numbers were with Vulkan at zero context.


I’ve been using it for the past few days and the output quality seems to be on par with or slightly better than 3.5 27B. The biggest issue is the token usage, which has exploded with this revision. It can easily reason for 20k-25k tokens on a question where the Qwen3.5 models used 10k. Since it runs more than 3 times faster, it still finishes earlier than the 27B, but I won’t have any context/VRAM left to ask multiple questions.
Artificial Analysis has similar findings.
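A quick back-of-the-envelope check of the "still finishes earlier" point, as a rough sketch. The token counts are the ones I mentioned above, the speed-up is set to exactly 3x (the lower bound of "more than 3 times faster"), and the baseline speed of 1.0 is an arbitrary unit, not a measured number.

```python
def relative_time(tokens: float, speed: float) -> float:
    """Generation time in arbitrary units: tokens divided by tokens per unit of time."""
    return tokens / speed


# Assumed figures from the comment above, not benchmark results:
# ~10k reasoning tokens for the 27B model at a baseline speed of 1.0,
# and 20k-25k reasoning tokens for the newer model at 3x that speed.
time_27b = relative_time(10_000, 1.0)
time_new_low = relative_time(20_000, 3.0)
time_new_high = relative_time(25_000, 3.0)

print(f"27B model:            {time_27b:.0f} time units")
print(f"newer model (20k tok): {time_new_low:.0f} time units")
print(f"newer model (25k tok): {time_new_high:.0f} time units")
```

Even in the worst case the newer model needs less wall-clock time, but it fills the context with up to 2.5x as many tokens, which is where the VRAM problem comes from.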



I agree with the suggestions of the other commenters, just wanted to add that I personally run llama.cpp directly with the built-in llama-server. For a single-user server this seems to work great, and it is almost always at the forefront of model support.


I’m running it with the UD_Q4_K_XL quant on a 24GB VRAM 7900XTX at ~85 tokens/s. Since it’s an MoE model, CPU inference with 32 GB of RAM should be doable, but I won’t make any promises on speed.


If you value the ‘source’ actually being open: AllenAI has released open source models with open training data, code and science. They’ve also published the multimodal Molmo models.


Such a huge increase compared to previous months, with most of it coming from ‘64 bit’ and ‘0 64 bit’, seems suspicious. Don’t give me false hope…


Thanks, I added the checkout link to the OP


I got some weird specialised hardware over USB working via WinBoat. Might be an option for some.


Unfortunately, the AI community prefers rushed buggy development over proper, tested releases, so the quants and maybe the PR weren’t fully working.
As of 3 hours ago, Unsloth was still updating their quants and guide. I don’t have time to test now, but I wouldn’t judge the base model performance in the first few days, when the bugs are still being worked out.
They also recommend some unconventional parameters in the Unsloth guide.
It could also be that the model is truly shit of course.
Edit: I just took a look at the llama.cpp repo and there are still issues with the implementation as well.


Found it by looking up dark mode Firewatch wallpapers.
Edit: Didn’t find higher resolutions of this specific one. But here are slightly different, higher-resolution variants: https://imgur.com/a/jvkoP
And the second one in this list