(Experimental) RandomX support…

As long promised I finally did find some time to work a bit more on my RandomX support (the things one does over Thanksgiving “holidays” – sigh)… and at least experimentally, uploaded a first version with randomx support (use “–algo rx”). As usual, downloads are on http://www.lukminer.net/releases/ .

To indicate that this is the first version that supports anything other than cryptonight variants I went all out and tagged this one as “2.0.0” (which yes, means that I completely skipped all 1.x version numbers – well, let’s not go into that). Note I added rx support only for phis – almost all my users are only interested in that, anyway, and for regular CPUs there are plenty of other miners out there that support it, too…. so.

In terms of performance: On my 68-core Phi 7250 I’m getting ca 6800 hashes per second. That’s “decent” if you compare it to what the benchmark pages report for mid-range Xeons, and pretty good compared to GPUs … but quite a bit lower than what a high-end Epyc will do. For those that are wondering: RandomX (at least in my current implementation) stresses mostly scalar performance, and thus benefits from high clock rates and multi-issue out-of-order … none of which the Phi was designed for: Yes, it does have 4-way hyperthreading (which helps), and some sort of out-of-order processing – but it runs at only 1.4GHz (3x lower than an Epyc!), and only has a single scalar pipe (it has one more pipe for vector instructions, but there aren’t many of those). For CryptoNight the MCDRAM was the big saviour (CN is mostly memory-bound) – but on RandomX, the phi is actually CPU bound, not memory bound, so the MCDRAM doesn’t help that much.

Anyway – we’ll have to see if ~7kH/s is economically feasible …. guess we’ll know in a few days how price and difficulty will pan out. I also have a few more ideas on optimizations … well, we’ll see.

’til then – happy mining!

Service back up again … check your miners!

Hey,

To all those that do run monero: Yesterday morning supportxmr – the pool I use for my devshare – accidentally blocked my devshare address, thinking all the many different connections were a botnet. And since the miner won’t do anything if it can’t reach the devshare pool this not only brought down my devshare, but pretty much everybody else’s mining, too.

This has already been fixed (those guys were really helpful!), and I’m already thinking about ways of how that can be avoided in the future – but be that as it may, I’d suggest you all check if your miners actually do anything useful; as far as I can see from the devshare hash rate most of the miners are still offline despite the devshare being unblocked, so I assume there’s a lot of them where the machine itself crashed, or for some other reason did not come back up after the miner was down for so long…. and before you have your machines running all idle – best check them!

With that,

Happy Mining!

 

Update: LukSticks, and Higher Perf…

Good news up front

  1. I finally made some new luksticks (I’ll write about that separately), and
  2. I managed to get quite a good performance boost for v4r last night
    (from 1700ish to 2200ish on my 7210, and from 1900ish to 2350ish on my 7250)

Man, what a day, yesterday.

First I had a long, gruelling hike with my dogs, and though I thought I could hardly lift a finger after that, I did finally get in the mood to lock myself in with my 7250 workstation and do some serious coding (because with the lukSticks finally done – see below – I finally did have some time left to do that!).

Locked myself away from everybody else, got some uninterrupted coding time (oh, so rare), and got to work. Got on a roll, finally found the time to do a lot of the code re-orgs that I’ve been talking about for ages – much better vectorization – and managed to constantly chip away on hash rate, one instruction at a time. In fact, that entire process worked so well that I eventually worked way into the night (it’s so hard to stop while more perf seems to lie around the corner!) until finally, at about 1am I called it a day (or called it a night?) – having managed to bump performance on my 7250 workstation from about 1900H/s to nearly 2400. Wonderful. Now just build a release, quickly test it on my 7210, and finally get to bed.

Not so. First, building the release took a while, so I had to stay up til about 2 watching the build run through – which, when dead tired and at 2am – is about as thrilling as watching paint dry…. in particular if it literally takes an hour.

But finally, release is done, so just quickly download onto my 7210 Asrock machine for final perf measurement (because most people on this blog are more interested in 7210s than in what my workstation can do!), then just write a quick note on the blog, and finally get to bed, after a nice and satisfying day’s achievement. Right?

Not so. Downloaded the baked release, put it on a lukstick, went to my garage, popped it in, booted …. and saw no difference. None. Nada. Zilch. Of course, first thought was “oh, those darn luksticks again – probably messed up the miner upgrade”… but no, miner shows new version number, just as it should. Still, no difference. Reboot, try again. Nothing. Try another node, same thing. Conclusion: Apparently all my work was for nothing – maybe it did work on a 7250, but not on a 7210? Argh. Went to bed, totally devastated. What a bummer.

Except. Of course, that story just didn’t make sense, it simply couldn’t be… and in fact, my wife had the right idea from the start, as soon as I whined about my experience this morning: “maybe you still ran the old code, and it just wrongly showed the new version?”. Doesn’t sound plausible, except that’s exactly what happened: Since 7250s are not exactly the fastest at compiling and linking I do bake my releases on a different machine, so before doing that I have to commit and push on my workstation, then pull on the baking machine, bake there, and wait for the goodness to happen. And since I had forgotten to bump the version number before committing (yes, I was tired) I didn’t actually bump the version number until I was on the baking node – and since git had apparently hit some conflict when pulling the latest release (which I had completely overlooked – yes, I was tired) I had never actually pulled the new code at all. So I did in fact bake the old code with the new version number. Yay. (If you learned anything: Don’t bake releases when you’re dead tired – it can wait a day).

Anyway – just had another look this morning after at least a few hours’ sleep; found the git pull that hadn’t gone through, baked a mini-release for testing, put that on a lukstick, and ta-daaa: 2200 H/s on my 7210. A second node I tested was slightly slower – say 2100 – but even that is a massive speedup over the 1700ish from just yesterday morning.

All in all, a few weeks ago we were down to less than 30 cents/kH at 1700 H/s (the first v4r version even had less than 1kH/s!) …. and today, we’re up to 2200ish H/s, and that at today’s nearly 50 cents/kH … that’s quite a difference. $1/day is a far cry from the $20/day in the crazy times – but it sure beats 30 cents a day!
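As a back-of-the-envelope check on those numbers, here is a tiny Python sketch. It assumes that the “cents/kH” figures mean cents per day for each kH/s of hash rate (that reading is my assumption, not something the profit calculators spell out), and `daily_revenue_usd` is just a hypothetical helper name.

```python
def daily_revenue_usd(hash_rate_hs, cents_per_kh_day):
    """Rough daily revenue in dollars.

    hash_rate_hs     -- hash rate in H/s
    cents_per_kh_day -- assumed payout in cents per day per kH/s
    """
    return (hash_rate_hs / 1000.0) * cents_per_kh_day / 100.0


# First v4r version: just under 1 kH/s at roughly 30 cents/kH
print(f"${daily_revenue_usd(1000, 30):.2f} per day")
# Current version: 2200ish H/s at nearly 50 cents/kH
print(f"${daily_revenue_usd(2200, 50):.2f} per day")
```

Under that assumption the two cases come out at roughly 30 cents and a bit over a dollar a day, which is where the numbers above come from.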

Aaaanyway – that was a long story just to say: Performance is up! I’ll be doing some more testing soon, just to make sure that I didn’t accidentally disable CPU version, MPSS version, or other coins like haven again :-/ (I actually do that during development to keep compile times low…), and will then bake a “real” release. No more optimizing today – today will be all about testing, and releasing.

Anyway – it’s all good news; I’ll write a bit more about the new lukSticks, soon, too (and that was a real nightmare!), but for now, back to making this release!

With that – happy mining!

v.15.6 – CPU re-activated, and a bit more perf on the Phis

Just uploaded v.15.6 – this was primarily built to re-enable the CPU path, but while looking at that I also realized that I had some pretty egregious spilling code in the KNL path that wasted cycles. In .15.6 this is now fixed. Latest performance:

  • on my 7250, with Ubuntu: +/- 1940 H/s
  • on a 7220A active card: +/- 1840 H/s.
  • 7210 …. still have to check, but not today.

Quite interestingly, at those rates we’re coming back to profitability numbers we haven’t seen in a while: at least if the profit calculators can be believed, at current reduced difficulty 1.9kH is back over 80 cents a day. Not as much as during the crazy days (I still remember the $7.50 a day :-/), but way better than the $.20 we’ve seen recently!

With that – happy mining!

v0.15.5 with v4r support is out

Since I did get a few emails asking whether the new version is now out: Yes, it most certainly should be!

I did have some last-minute glitch in that all my testing machines apparently burned through one more of the wires in my home’s electrical wiring (third time :-/), so a few wall sockets lost power, including the one to my router – so the release was baked early yesterday, but didn’t get pushed out right away because the router was dead. Oh, heck, it all happens at the same time. Either way, a single extension cord from another room fixed that, and the 0.15.5 version was pushed some time before midnight last night.

And as always, it should be on the releases page, under the following link: http://www.lukminer.net/releases/

A few notes:

  • this version should have code for at least a month or two’s blocks compiled in. I’ll upload newer ones when it gets close to that.
  • In this version I did fully deprecate support for the x100 generation phis. If somebody can convincingly make a case for re-activating that I can still do that – but it’d literally have to be thousands of those x100 generation phis to make that worthwhile, and I doubt those would be profitable.
  • In the latest drop I’ve disabled the CPU version; partly because this reduced the compile time by 4x (yes, the CPU build is 3x more expensive because of the different instruction set specializations….), and partly because I thought nobody used that any more. I already got comments to the contrary, so will re-activate that in the next drop.
  • I have not yet gotten around to fixing smaller coins like turtle or aeon.

Either way – v0.15.5 is out, and should be working.

With that – happy mining!

v4r is live.

Blockchain height 1788000 has just passed, so if you’re mining monero, you have to use the new miner.

Good news: Performance is roughly where I’d want it to be:

[Screenshot from 2019-03-09 11-35-28]

Bad news: I did (of course!) introduce a final bug when I built the release last night, so the v0.15.1 version currently on the downloads page will actually crash at that blockchain height. (Apologies – but that happens when you can’t actually test on the ‘real’ blockchain, yet :-/). The fix was literally 10 lines of code … but still – I’m currently baking the new version, and will upload as soon as the make is through.

Oh-kay … v4r, here we come.

Oh-kay – here we are. Two-three days ago I promised a version “probably tomorrow” …. but then – in my hubris – decided to do an apt-update on my KNL development machine … which led to a reboot … during which it realized that the harddisk has some broken sectors … which means it could no longer boot … or even mount the file system on another machine … which …. darn – what better time to lose a harddisk than two days before a release!? Seriously??? Anyway – most of the code was safe under git, it just took forever to reinstall that machine to a state where I could actually develop on it….

Aaaanyway. Release. v4r. I just tagged and built a first internal release with all the right code, including updated version numbers, mpss support, etc. A bit more testing, then I’ll post that public. To make up for lost time in that version I disabled KNC completely, and I think v4r will only run on knl/phi and mpss-knl right now, but at least on my machines it’s working great for both of those configs.

As to performance: What I’m seeing right now is close to 1800-2000 H/s on my 7250, which is pretty good. Exceeeeept…. there’s one caveat: The new thing in v4r is the “randomized code generation”, which means that every block height will run different code during hashing. To do that, there’s basically three options: a) run an interpreter, that just interprets that code sequence on the fly; or b) add a just-in-time compiler to the miner; or c) pre-build all (or at least, a lot) of the upcoming block heights’ codes, and link them all together into the same binary. Since I’m not going to add a compiler (well – not yet, at least) I went for a mix of a) and c) : I do include some blocks (for those I get ca 1.8-2kH/s), but then have an interpreter fall-back for those not built into the binary (and for those, we’ll only get 0.9-1kH/s).
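For those who like to see that in code: the mix of a) and c) boils down to a lookup table with a fall-back. What follows is only a toy Python sketch and not the miner’s actual code. The three-instruction “instruction set” and the names `generate_program`, `compile_program`, `PREBUILT` and `run_for_height` are all made up for illustration, and the real v4r code generation is quite a bit more involved.

```python
import random

# Toy instruction set; purely illustrative, not the real v4r operations.
OPS = {
    "add": lambda a, b: (a + b) & 0xFFFFFFFF,
    "mul": lambda a, b: (a * b) & 0xFFFFFFFF,
    "xor": lambda a, b: a ^ b,
}
OP_NAMES = sorted(OPS)


def generate_program(height, length=8):
    """Derive a pseudo-random instruction list from the block height."""
    rng = random.Random(height)
    return [(rng.choice(OP_NAMES), rng.randrange(4), rng.randrange(4))
            for _ in range(length)]


def interpret(program, regs):
    """Option a): decode and dispatch every instruction on every call."""
    r = list(regs)
    for op, dst, src in program:
        r[dst] = OPS[op](r[dst], r[src])
    return r


def compile_program(program):
    """Stand-in for option c): build straight-line code once, ahead of time."""
    templates = {
        "add": "r[{d}] = (r[{d}] + r[{s}]) & 0xFFFFFFFF",
        "mul": "r[{d}] = (r[{d}] * r[{s}]) & 0xFFFFFFFF",
        "xor": "r[{d}] = r[{d}] ^ r[{s}]",
    }
    lines = ["def run(regs):", "    r = list(regs)"]
    for op, dst, src in program:
        lines.append("    " + templates[op].format(d=dst, s=src))
    lines.append("    return r")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["run"]


# Only a handful of heights are "built into the binary" (hypothetical range).
PREBUILT = {h: compile_program(generate_program(h))
            for h in range(1788000, 1788010)}


def run_for_height(height, regs):
    """Use the pre-built code if we have it, else fall back to the interpreter."""
    fast = PREBUILT.get(height)
    if fast is not None:
        return fast(regs), "prebuilt"
    return interpret(generate_program(height), regs), "interpreter"


regs = [1, 2, 3, 4]
print(run_for_height(1788000, regs)[1])  # prebuilt
print(run_for_height(1900000, regs)[1])  # interpreter
```

Both paths compute the same result for a given height; the only difference is how much work is spent per instruction, which is where the speed gap between built-in and interpreted blocks comes from.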

Now the hope is that I’ll eventually ship a binary that has the next few months’ codes all built in – but at least right now I’m having “some minor issues” with the compiler spending literally hours to build a release when too many blocks are built in … and to make matters worse, the test net runs on a different block chain height than the real one will, so I’ll actually have to release different code than what I can test. As such, first version with v4r support might not actually have the right blocks compiled in, yet, and therefore may be running at half the speed. I do have a bigger build with more blocks running in the background …. but …. as I said, it might take a while.

Anyway, two key takeaways:
– v15.1 with v4r support currently building.
– expected performance (once the right binary is out): somewhere around 1800-2000H/s

With that – happy mining!

(PS: Now where did I put that compiler book ….. ?)