Upcoming v.14 w/ KNL v8 Performance Improvements

Good news – I’ve finally made some progress with performance improvements for v8. I’m not yet where I want to be (i.e., where I think the true “speed of light” for the KNLs should be), but as of last night I finally got a newly vectorized version running around 30% faster than the previous one. There’s more potential for optimizations in that version – it’s a complete rewrite, so a lot is still in flux – but roughly 30% is already enough to at least share this version. I’ll probably need tonight – and maybe tomorrow – to do some more testing, burn-in, packaging, etc., but “something” should be coming up soon.

BTW: Just to explain where the entire v8 performance issue came from: In the past, the cryptonight family stressed only memory performance, with relatively little “compute” thrown in … yes, there was the AES encoding step, and the 64-bit multiply, but both are hardware-supported on CPUs and KNL, so the “true” cost was exclusively in the memory system. With v8, this has changed – there’s now some pretty nasty 64-bit division and double-precision floating point multiply (plus a lot of additional gunk) in the inner loop, and these are pretty compute intensive. To get these pieces fast I had to completely change the vectorization pattern in the inner core, and doing that is a pain if ever there was one: get a single bit wrong and you get a wrong result, and since all intermediary numbers are completely meaningless semi-random bit patterns it’s nearly impossible to debug in any reasonable way … all you can do is write gigabytes of log files of every operation performed, and compare them bit by bit.
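To illustrate the kind of bit-by-bit comparison described above, here is a minimal Python sketch. It assumes that both the known-good scalar code and the new vectorized code dump one 64-bit value per operation into a list; the function names and the trace format are made up for illustration and are not taken from lukMiner.

```python
MASK64 = (1 << 64) - 1


def first_divergence(reference, candidate):
    """Return (index, xor_mask) for the first differing value, or None.

    If one trace is a strict prefix of the other, the index is the
    length of the shorter trace and xor_mask is None.
    """
    for index, (ref, got) in enumerate(zip(reference, candidate)):
        if ref != got:
            return index, (ref ^ got) & MASK64
    if len(reference) != len(candidate):
        return min(len(reference), len(candidate)), None
    return None


def differing_bits(xor_mask):
    """List the bit positions (0 = least significant) set in xor_mask."""
    return [bit for bit in range(64) if (xor_mask >> bit) & 1]


if __name__ == "__main__":
    # Hypothetical traces: the third operation differs in bit 0 only.
    scalar_trace = [0x0123456789ABCDEF, 0xFFFFFFFF00000000, 0xDEADBEEF]
    vector_trace = [0x0123456789ABCDEF, 0xFFFFFFFF00000000, 0xDEADBEEE]
    index, mask = first_divergence(scalar_trace, vector_trace)
    print("first mismatch at operation", index, "bits", differing_bits(mask))
```

Knowing the first operation that diverges, and which bits differ, at least narrows the search to one step of the inner loop.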

Anyway – that restructured code is now working. Lots of opportunity to do some low-level optimizations, and probably even a reasonable way of porting all that to KNC, too (which needs the same kind of vectorization) …. but at least in the short term I had to change a lot, probably broke a lot (including the regular CPU version :-/), etc. It’ll take a day to clean up and release a first version, but from then on we’re back on an upward ramp. Happy!

With that – happy mining!


PS: Just to give you guys an idea of some of the things I had to deal with on this rewrite: The newly vectorized code needs to do some 64-bit integer divisions, and though KNL can do that in AVX512F, the respective intrinsic (_mm512_div_epu64) is not supported in either clang or gcc (not even in the latest top-of-tree builds, let alone released versions); and though the Intel compiler does support this operation, you need the very latest Intel compiler to even run on Ubuntu 18; and …
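For illustration only: when a compiler offers no packed 64-bit division, one generic fallback is to compute each lane with plain shift-and-subtract long division. The Python sketch below shows that technique on eight lanes (the number of 64-bit values in a 512-bit register). It is not necessarily what lukMiner ended up doing, and all names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def div_u64_lane(numerator, divisor):
    """Unsigned 64-bit restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero in lane")
    numerator &= MASK64
    divisor &= MASK64
    quotient = 0
    remainder = 0
    # Walk the numerator from the most significant bit down.
    for bit in range(63, -1, -1):
        # Bring down the next bit; this intermediate value can need 65 bits,
        # which Python integers handle transparently.
        remainder = (remainder << 1) | ((numerator >> bit) & 1)
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder


def div_u64_lanes(numerators, divisors):
    """Divide lane by lane, returning only the quotients."""
    return [div_u64_lane(n, d)[0] for n, d in zip(numerators, divisors)]


if __name__ == "__main__":
    nums = [MASK64, 1 << 63, 12345678901234567890, 7, 0, 100, 99, 1]
    dens = [1, 3, 987654321, 7, 5, 7, 100, MASK64]
    print(div_u64_lanes(nums, dens))
```

A loop like this costs 64 iterations per lane, which hints at why division in the inner loop hurts so much compared to the hardware-supported multiply.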

XMR-v8 fork: Remember to UPDATE YOUR MINERS!

Today is the day for the v8 fork – and given that the hash rate on Dwarfpool just took a precipitous drop, I assume that it just happened. As such: Make sure to check your miners, and update to 0.12.1 as soon as v8 is active!

For those using the Phi 7220/7240 PCI cards – make sure to use 0.12.1, not the 0.12.0 I posted earlier this week – 0.12.0 works on socketed phis, but had a bug in the MPSS offload code, which got fixed in 0.12.1.

With that – happy mining!

PS: And of course, also change “-a xmrv7” to “-a xmrv8” in your mining scripts!
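A quick way to make that change across a script is a small search-and-replace. The sketch below assumes the flag appears as "-a xmrv7" with whitespace between flag and value; the command line in the example is a hypothetical placeholder, not an actual lukMiner invocation.

```python
import re


def switch_algorithm_flag(script_text, old="xmrv7", new="xmrv8"):
    """Replace the value of the "-a" flag, leaving everything else as-is."""
    pattern = re.compile(
        r"(?<![\w-])(-a\s+)" + re.escape(old) + r"(?![\w-])"
    )
    return pattern.sub(lambda match: match.group(1) + new, script_text)


if __name__ == "__main__":
    # Hypothetical script line; only the "-a" flag comes from the post.
    line = "./miner -a xmrv7 <other options>"
    print(switch_algorithm_flag(line))
```

The look-around checks keep the pattern from touching longer flags or values that merely start with the old name.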

A Brief update on TRTL and AEON

Over the last week, I had at least two people ask me about updating the miner to support the cryptonight light algorithm required by AEON and TRTL. When I got these requests, I was a bit confused … I thought I had updated those ages ago … but who knows, maybe there had been another fork!?

Well, I didn’t have any time to look into it until earlier today; but having now just re-tested the respective two command-lines from the “supported coins” page, I still am confused: at least on my side both AEON and TRTL are running just fine. That said, nobody seems to have run either one of these two coins for months (there’s no dev share activity for them), so there seems to be some issue that I cannot reproduce.

As such: If anybody did want to run those coins, and ran into issues with it: Please let me know. The only issue I could think of is that the miner that is preinstalled on the lukSticks is too old (in which case all you have to do is update the miner on those sticks), but otherwise it should work just fine.

All that said – AEON and TRTL might not be the most profitable coins for Phis: The big advantage of the phis is that they have lots of MCDRAM, so the “heavier” the coin the bigger the (relative) advantage over CPUs with smaller caches – so at least if the market forces are only mildly in effect CPUs with small caches should be most profitable on “light” coins, and phis most profitable at “heavy” ones.
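To make the cache argument a bit more concrete, here is a back-of-the-envelope sketch. It assumes the commonly cited per-thread scratchpad sizes of the cryptonight family (1 MiB light, 2 MiB standard, 4 MiB heavy), a made-up CPU with 8 MiB of last-level cache, and 16 GiB of MCDRAM on the Phi. The names are hypothetical, and real performance depends on much more than capacity.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

# Assumed per-thread scratchpad sizes for the three algorithm "weights".
SCRATCHPAD_BYTES = {
    "light": 1 * MIB,
    "standard": 2 * MIB,
    "heavy": 4 * MIB,
}


def threads_that_fit(fast_memory_bytes, variant):
    """How many scratchpads of this variant fit into the fast memory."""
    return fast_memory_bytes // SCRATCHPAD_BYTES[variant]


if __name__ == "__main__":
    cpu_cache = 8 * MIB      # hypothetical CPU last-level cache
    phi_mcdram = 16 * GIB    # assumed MCDRAM capacity
    for variant in SCRATCHPAD_BYTES:
        print(
            variant,
            "cpu:", threads_that_fit(cpu_cache, variant),
            "phi:", threads_that_fit(phi_mcdram, variant),
        )
```

Under these assumptions the small-cache CPU loses three quarters of its in-cache threads going from light to heavy, while the Phi still has room for far more scratchpads than it has hardware threads.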

With that – happy mining!

xmr-v8 ready to go …

Another heads-up: I finally managed to stress test the v0.12 version that supports xmr v8, and at least on the test net it works perfectly fine on the Phis, too. I haven’t ported it to KNCs yet (KNL has many more users, and thus much higher priority :-/), but at least on KNL it seems to work fine.

The latest release (v0.12) is available at its usual place (http://lukminer.net/releases) – but make sure to remember to change your algorithm flag (“-a”) to “xmrv8” once the fork hits. And of course, do not use the v8 flag before that fork happens.

Finally, a note on performance: Don’t be too surprised if you see significantly lower hash rates once v8 hits – the additional operations they added to the inner loop are really expensive, so on the (non-asrock) development machine I was using I’m seeing a drop from about 2500 to about 1700 H/s. That’s a little bit more than I expected, but as I just said: The additional operations are expensive, so any other CPU or GPU miner will likely see quite an impact on hash rate, too … which means difficulty should adjust accordingly, at least after a few days.
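Put as plain arithmetic (a small sketch using only the two numbers quoted above; the function name is made up):

```python
def relative_change(before, after):
    """Fractional change going from `before` to `after`."""
    return (after - before) / before


if __name__ == "__main__":
    drop = relative_change(2500, 1700)
    needed = relative_change(1700, 2500)
    print(f"hash rate change with v8: {drop:.0%}")           # -32%
    print(f"speedup needed to get back to 2500: {needed:.0%}")  # 47%
```

Note the asymmetry: a drop of about a third takes a speedup of almost a half to undo.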

With that – happy mining!

Heads-up on XMR v8 support …

It’s been a while since I last posted (I’ve switched employers, and been rather busy in the new job …), and as I can see from the many emails I got, quite a few people had already started to wonder whether I had disappeared from the face of the earth completely – and in particular, had gotten a bit concerned about what would happen with lukMiner once the upcoming Monero v8 fork hits (which looks to be pretty soon now).

Thus – to hopefully put those fears to rest, here’s a quick heads-up: As of a few minutes ago I finally finished a first draft of the XMR v8 changes, and at least on the testnet they seem to be working fine. I haven’t done full testing and burn-in yet, so the final release may take a bit longer – but still: so far everything looks good for a “v8” version even before the actual switch happens …

With that – happy mining!

Created new “static” supported-coins overview page …

I have in the past tried my best to document which coins lukMiner does support – by keeping an up-to-date readme with each version, by having my release scripts automatically post that together with the latest releases, by having the lukMiner binary print example commandlines when launched without parameters, by posting new blog articles every time I added a new coin, etc … but still, with the flurry of new cryptonight coin variants several users asked for some better “overview” of what is supported, how to call it, etc (I guess a new article for every new coin is good and well – but not all too useful if you’re new to lukMiner and have to google through 10 such articles).

In light of those requests, I did spend some time trying to figure out how WordPress really works, and did finally manage to create a new “static” page that is not a blog post, but is accessible from the main page of http://lukminer.org: It’s called – who’d have guessed – “Supported Coins”, and is also statically linked at its own URL (https://lukminer.org/supported-coins/).

In the future, I’ll still add updates to the README.md as I did in the past, but will also continually update this page with any newly supported coin, changes in how to execute them, etc. In particular, I’ll use the example command lines on this page to do my own testing – so if any of those don’t work, please let me know (and/or check your firewall settings).

With that – Happy Mining!

lukMiner v0.11.4 adds Stellite and Masari

Upon repeated popular request, I just added Stellite and Masari…

In the last few weeks there’s been a veritable flurry of different new “cousins” of algorithms in the cryptonight family – it all started with monero going v7, then aeon v1, haven, alloys, stellite, turtle, sumo, loki, ryo, niobio, and at least a dozen others that are all virtually identical, but with minute differences in their core algorithm. Adding those is relatively easy – they often differ in only a few lines of code (templates are your friend!), but it does take a certain effort to set up a node, create a wallet for the dev share, do some testing and burn-in, bake a new release, update the documentation, etc; as such, I usually add new coins only when they seem to be “real” (and even then, only on the next weekend :-/), but once a few users ask for it, I usually do add it at some point in time.

As such: I have the honor to present – ta-daa – the new version v0.11.4 that primarily adds Stellite and Masari. I’m also almost ready with IPBC and Arto, if only I get them to compile on the cloud node that hosts my nodes….

Please note I did run them for a little while for testing, but this time did not do a full 24 hour burn in test. If there are any issues, please let me know.

With that:

Happy Mining!