To all KNC/MPSS users: Please immediately update to http://files.lukminer.com/lukMiner-0.9.2.tgz !
First of all: To all those that volunteered to be guinea-pigs and try the MPSS offload version I added in 0.9: A giant “mea culpa, mea culpa, mea maxima culpa”… I hardly know how to write this without feeling like a complete ass, but that MPSS offload version did indeed have a major flaw that led to the KNC device code getting stuck in mining developer shares after 10 minutes, so to whoever ran that code: your KNC card(s) have been mining for me, and only for me, since 0.9 came out … oh man, I don’t know what to say…
The reason I didn’t spot this sooner – even after two KNC/MPSS users reported unexpectedly low hashes for their accounts – is that this bug appears only in the MPSS offload version (not in cpu, phi, opencl, cuda, or even knc native mode) … and even when running that mpss mode it only happened for the shares computed on the KNC (the CPU threads still mined for the user) … and even then, it’d happen only after a few minutes … and even then, you’d only “really” see it if you ran without the CPU threads … and even then, the outputs looked absolutely right … which made it all so hard to reproduce. But still, those are nothing but empty excuses; I did verify the bug existed, and now I do feel like said “complete ass”.
Now, what next? First of all: Once again, mea culpa, that should not have happened. Also many, many thanks to those that reported this bug, and still stuck with me … I owe you one. Next: To everybody that did run this MPSS version for a considerable time: let me know how much you think you should have made, and I’ll gladly reimburse you (it can’t be that much; the MPSS shares are rather small). In addition: as another sign of how sorry I am, I just changed the KNC miner share from 4% to 1%, indefinitely. And finally: if you do intend to run in MPSS mode, please update your miner to 0.9.2 ASAP (here is the link: http://files.lukminer.com/lukMiner-0.9.2.tgz).
Again, my most sincere apologies… I don’t know what else to say …
Update 2/11: Updated the link from “0.9.2rc2” (release candidate 2) to “0.9.2” (the actual release).
15 thoughts on “x100 MPSS Users: Major bugfix release!”
First, no apology is required. You gave a full disclaimer that this was a work in progress. You owe me nothing if I’m making something from your hard work.
We will still owe you in the end, and the reduction is greatly appreciated. Without Lukminer, there is no Phi mining.
Thank you indeed! I am always amazed by the community of this blog’s readers: I have yet to see a _single_ negative comment or feedback in any of my posts or comments (and that in an age of hate-posts and trolls pretty much everywhere else!)…. and even in this instance, where the fault was clearly mine, only supportive comments. Highly appreciated!
I totally agree with MrPoet, you owe me nothing.
I have a question and an issue for you.
The issue is that the -h/--help argument doesn’t work. The question is: how can I run the opencl version without using the cpu? I tried -t 0 but one core goes to 100% anyway. Is there an argument to select which opencl devices to use?
Thanks again for your work
PS: The knc native miner dev fee is still 4%; I keep using this version 🙂
Thanks for reporting; just filed those to my bug tracker.
– The “-h” is a bug – seems that broke in the rewrite for 0.9x. Will fix for next release.
– The “-t 0” is the right thing to do for not using the CPUs – the reason one core remains running is likely the polling for results. I’ll see if I can reduce that, shouldn’t be too hard.
– Dev share for the native version – sigh, yes, you’re right, I only changed the MPSS binary; will fix in next version!
THANKS!! I set up CentOS 7.3 with a Phi 3120A and an Intel i3 4130 using your tutorial, and it works really well!
[15:18:41] knc device #0: share FOUND (nonce 0x00022A19)! (hash rate this thread = 541.797H/s)
[15:18:41] submitting share w/ difficulty 15420
[15:18:41] -> share *accepted*: 218/218 (100.00%) – total hashrate 612.97H/s (may take a while to converge)
[root@localhost ~]# micinfo
MicInfo Utility Log
Created Tue Feb 13 15:16:25 2018
HOST OS : Linux
OS Version : 3.10.0-514.el7.x86_64
Driver Version : 3.8.3-1
MPSS Version : 3.8.3
Host Physical Memory : 3667 MB
Device No: 0, Device Name: mic0
Flash Version : 2.1.02.0391
SMC Firmware Version : 1.17.6900
SMC Boot Loader Version : 1.8.4326
Coprocessor OS Version : 2.6.38.8+mpss3.8.3
Device Serial Number : ADKC33300924
Vendor ID : 0x8086
Device ID : 0x225d
Subsystem ID : 0x3608
Coprocessor Stepping ID : 2
PCIe Width : x16
PCIe Speed : 5 GT/s
PCIe Max payload size : 256 bytes
PCIe Max read req size : 512 bytes
Coprocessor Model : 0x01
Coprocessor Model Ext : 0x00
Coprocessor Type : 0x00
Coprocessor Family : 0x0b
Coprocessor Family Ext : 0x00
Coprocessor Stepping : C0
Board SKU : C0PRQ-3120/3140 P/A
ECC Mode : Enabled
SMC HW Revision : Product 300W Active CS
Total No of Active Cores : 57
Voltage : 985000 uV
Frequency : 1100000 kHz
Fan Speed Control : On
Fan RPM : 3000
Fan PWM : 52
Die Temp : 82 C
GDDR Vendor : Elpida
GDDR Version : 0x1
GDDR Density : 2048 Mb
GDDR Size : 5952 MB
GDDR Technology : GDDR5
GDDR Speed : 5.000000 GT/s
GDDR Frequency : 2500000 kHz
GDDR Voltage : 1501000 uV
Cpu Temp: ……………. 81.00 C
Memory Temp: …………. 64.00 C
Fan-In Temp: …………. 42.00 C
Fan-Out Temp: ………… 65.00 C
Core Rail Temp: ………. 64.00 C
Uncore Rail Temp: …….. 63.00 C
Memory Rail Temp: …….. 63.00 C
Nice! I remember “from the olden days” (when I worked on KNCs in my day job) how much of a challenge installing the MPSS stack could be, so glad those step-by-step instructions make it easier!
In particular, glad to hear from somebody that actually benefitted from it – it’s a surprising amount of work writing those articles, so it’s hearing from somebody that it was actually useful that makes it worth it! Enjoy!
“I only changed the MPSS binary; will fix in next version!”
Awesome! I’ve been looking forward to running native to see if there’s any difference from MPSS.
There “shouldn’t” be any major difference, other than the network setup getting more tricky (in native mode the KNCs themselves need their network set up, while in MPSS mode only the host does).
Fair warning, though: Those KNCs are _not_ as well tested as the other platforms, because I only run one pair of them, in a machine I don’t often get to. In particular, one user already reported that he _thinks_ that from time to time the MPSS version runs into some issues (to be more exact: when the miner restarts on a dropped connection – nicehash, my bane – then “apparently” the performance after that restart is lower – probably two instances running on the KNC in parallel :-/). Will take a while to fix.
I’ve noticed dropped connection issues on both 0.9.0 and 0.9.2. I’m not sure if it’s because it’s switching to the dev share, or if it runs into connection problems, and just returns to the dev share by default, but every time I’ve seen it, it looks like this:
[12:58:49] semi-fatal error in serving stratum 1: could not read from socket!
… cancelling active job and reconnecting
[12:58:49] connecting to pool xmr-usa.dwarfpool.com:8080
[12:58:49] semi-fatal error in serving stratum 2: could not read from socket!
… cancelling active job and reconnecting
[12:58:49] connecting to pool xmr-usa.dwarfpool.com:8080
Could this be a problem with the pool you’re using? I can’t imagine that there’s different socket code for the dev side and the user side.
Also, 0.9.0 would segfault quite a bit. Upgrading to 0.9.2 seems to have fixed that. I dumped a log:
Probably irrelevant now, but I thought it worth mentioning. I can provide core dumps if necessary.
I’m happy to help debug native mode a bit more. Instead of port forwarding, I’m going to bridge them on to the local ethernet and let them get an address from the network’s DHCP server. Then they’ll be accessible directly from anywhere on my network.
Lastly, you listed hash rates for 3120 and 7120. I’d like to report that the 5120 averages 580H/s.
Re the 5120 – just put it on the web page… thank you!
Re the “connection errors” – that does indeed look like dwarfpool is frequently dropping the connections; however, in this case it’s actually nothing to be afraid of: Unlike other miners, lukMiner doesn’t hold a single connection that it sometimes switches from devshares to user shares, but instead holds multiple connections at once, and will pick from those in parallel (depending on the miner share). So if one of those drops, it will give that message and reconnect, but that does _not_ mean that the miner is _only_ connected through this connection! What I presume is happening – in particular since you seem to be using KNCs – is that for longer times there’s nothing mined (or at least, found) at all on the devshares, so dwarfpool says “Hey, there’s nothing happening on this connection, I’ll drop it”… and the miner will simply reconnect as soon as this connection is detected as dead. That’s perfectly OK, and it’s only the output that’s misleading.
Re the core dumps in 0.9.0 – yes, that was a nasty bug: it did try the reconnect just as in 0.9.2, but the connection was refcounted, and when it threw the exception that refcount got decreased, so it would try to connect to a pool whose memory was already deallocated … which sometimes works, but often obviously doesn’t 🙂
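To make the “multiple connections at once” idea a bit more concrete, here is a rough sketch in Python. It is not lukMiner’s actual code: the names `pick_connection` and `reconnect_if_dead`, the `"dev"`/`"user"` keys, and the callbacks are all made up for illustration. It only assumes that each job is routed to one of two open connections with a probability equal to the dev share, and that a dropped connection is replaced on its own without touching the other one.

```python
import random


def pick_connection(connections, dev_share, rng=random):
    """Choose which open connection the next job should use.

    connections: dict with hypothetical keys "dev" and "user".
    dev_share:   fraction of jobs routed to the dev connection (e.g. 0.01).
    rng:         any object with a random() method returning a float in [0, 1).
    """
    key = "dev" if rng.random() < dev_share else "user"
    return key, connections[key]


def reconnect_if_dead(connections, key, is_alive, connect):
    """Replace only the connection stored under `key` if it has died.

    is_alive and connect are caller-supplied callbacks (assumptions here);
    the other connection in the dict is left untouched.
    """
    if not is_alive(connections[key]):
        connections[key] = connect(key)
    return connections[key]
```

With a dev share of 0.01, roughly one job in a hundred would go to the dev connection in this sketch; a pool that sees so little traffic on that connection timing it out would match the reconnect messages in the log above.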
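For readers who haven’t run into this class of bug: the following is a toy Python simulation of an unbalanced reference count, not the real 0.9.0 code. The `PoolConnection` class and both `serve_*` functions are hypothetical, and the real miner’s mechanics may well differ. It only illustrates how an error path that drops a reference it never took can “free” an object that the reconnect logic still wants to use.

```python
class PoolConnection:
    """Toy stand-in for a manually reference-counted connection object."""

    def __init__(self, url):
        self.url = url
        self.refs = 1        # the reference held by the owner
        self.freed = False   # in C or C++ the memory would be gone at this point

    def retain(self):
        self.refs += 1

    def release(self):
        self.refs -= 1
        if self.refs == 0:
            self.freed = True


def serve_buggy(conn):
    """Error path releases a reference that was never retained."""
    try:
        raise ConnectionError("could not read from socket")
    except ConnectionError:
        conn.release()   # unbalanced: this drops the owner's reference


def serve_fixed(conn):
    """Every retain is paired with exactly one release, even on error."""
    conn.retain()
    try:
        raise ConnectionError("could not read from socket")
    except ConnectionError:
        pass             # the caller can reconnect using conn afterwards
    finally:
        conn.release()
```

After `serve_buggy` the object is marked as freed even though the owner still holds it, which is the situation where a reconnect would touch deallocated memory; after `serve_fixed` the count is back at 1.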
Just discovered lukminer and thank you for your work.
Is it possible to select the device to use?
I have a GPU rig and mine many cryptocurrencies, so it would be nice to be able to select the device.
Yes, you can. Apparently the help output is somewhat broken in the current version (I’ll fix that next version), but ’til then:
“-cl 2,4,5” or “--cl-devices 2,4,5” would tell the miner to (only) use cl devices 2, 4, and 5 (you’ll see in the output which ones those are).
With “-t 0” you can force it to _not_ use the host CPUs (though usually, why not?).
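Since the built-in help is broken right now, here is a small Python sketch of how flags shaped like these could be read. This is purely illustrative and says nothing about how lukMiner parses its arguments internally; `device_list`, `build_parser`, and the `threads` attribute name are assumptions. Only the flag spellings (“-cl”, “--cl-devices”, “-t”) come from the reply above.

```python
import argparse


def device_list(text):
    """Turn a comma-separated string such as "2,4,5" into [2, 4, 5]."""
    return [int(part) for part in text.split(",") if part.strip()]


def build_parser():
    parser = argparse.ArgumentParser(description="illustrative flag parser")
    # Both spellings from the reply above map to the same value.
    parser.add_argument("-cl", "--cl-devices", type=device_list, default=None,
                        help="comma-separated OpenCL device indices to use")
    # "-t 0" means: do not use the host CPU threads at all.
    parser.add_argument("-t", dest="threads", type=int, default=None,
                        help="number of host CPU threads (0 disables them)")
    return parser
```

Parsing `["-cl", "2,4,5", "-t", "0"]` with this parser yields the device list `[2, 4, 5]` and a thread count of `0`.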
Nice, thank you very much.
Succeeded in getting luk-xmr-knc-native to run on a 3120A under Win7 tonight. The hardest part was getting the networking set up, some clues can be found from my posts here:
Hashrate is 544H/s and dev share seems to be hovering around 4.3%. I’d insert a screen shot but evidently this blog doesn’t allow it.
Tried the new 0.10.0 version and dev shares are 1% as promised.