(luk-)Mining with x100 “Knights Corner” Xeon Phis (3110, 5110, 7120, etc)

Okay; weekend again, so finally a few spare cycles to answer questions about mining with Phis… Yes, I know I promised an article on updated performance numbers for the second-generation Phis (the x200s). But since the x200s (at least the PCI cards) are kind of hard to come by, I recently also had lots of people ask me about mining with the older x100 Phis… and since that article is ready to go, let’s do this one first.

X100 Phi Variants … a primer…

With that introduction, this article is about mining with the x100 “Knights Corner” Phis – i.e., the Xeon Phi 7120s, the 31S1Ps, the SC 5110s, etc.; that is, pretty much any Xeon Phi with a ‘1’ in the second digit of the four-digit model number (hence the ‘x100’ – who’d have guessed?).

First, for those interested in buying some on eBay, a bit of lore on model names: The ‘1’ in the second digit indicates a first-generation Phi. The ‘A’ or ‘P’ at the end indicates whether it’s ‘A’ctively or ‘P’assively cooled (i.e., whether it has a fan or not)… and a fair word of warning: if you think of putting a ‘P’ one into your hobby mining rig at home – think again! Without strong forced airflow from the chassis you’ll quickly end up with something you can fry eggs on (and no, I’m not going to try!).

Apart from that ‘1’ and the ‘A’ and ‘P’, the other numbers indicate the exact number of cores, their clock rate, and how much memory those cards have. There are too many variants to list here, but if you’re curious, you can look up any specific model’s exact config on http://ark.intel.com . And if you’re curious what hashrate a given x100 will give, it seems you can extrapolate rather easily: take the hash rate of a model you know (say, around 650H/s on a 7120), divide by the core count and clock rate of that known model, then multiply by the core count and clock rate of the desired model, and you should have a reasonable estimate. (No guarantees – your mileage may vary – I’m not a lawyer and this is not legal advice, etc.)
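As a quick worked example of that extrapolation, here is a minimal sketch using awk for the arithmetic. The function name estimate_hashrate is made up for illustration (it is not part of lukMiner), and the inputs are the 7120 and 3120A figures quoted in this article.

```shell
#!/bin/sh
# Hypothetical helper (not part of lukMiner): scale a known hash rate by
# core count and clock rate, as described in the paragraph above.
# usage: estimate_hashrate KNOWN_HS KNOWN_CORES KNOWN_GHZ NEW_CORES NEW_GHZ
estimate_hashrate() {
    awk -v hs="$1" -v c0="$2" -v f0="$3" -v c1="$4" -v f1="$5" \
        'BEGIN { printf "%.0f\n", hs / (c0 * f0) * (c1 * f1) }'
}

# Known:  ~650 H/s on a 7120 (61 cores @ 1.2 GHz).
# Wanted: a 57-core card at 1.1 GHz (e.g. a 3120A).
estimate_hashrate 650 61 1.2 57 1.1
```

For those inputs the estimate comes out at 557 H/s, which is in the same ballpark as the 545 to 570 H/s listed for the 57-core cards further down. Treat it as a rough guide only.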

Some frequently asked questions re Mining on X100 Phis…

OK, now to some questions I’ve literally gotten several dozen emails about:

Does lukMiner work on x100 Phis? Yes, since around Christmas it does; I initially didn’t plan on supporting them, but too many people asked… The current code may not be quite as fast as it could be, but I’d guess I’m within 10% of optimal, and that’s good enough for now.

What do I have to do to run lukMiner on my Phis? In the first version, you had to manually copy it onto the phis, do some nasty stuff with port forwarding, etc … but that is no longer necessary! Since around 0.8.7 or so (forgot the actual version) lukMiner also supports so-called “MPSS offload”, which is way simpler. A complete step-by-step “howto” is further down in this article.

How do the x100s perform? Of course that varies by the actual model; but here are three typical models:

 Xeon Phi 7120   PCI-card coprocessor  x100 (KNC)  61 @ 1.2 GHz  8GB GDDR5  ~650  1360
 Xeon Phi 31S1P  PCI-card coprocessor  x100 (KNC)  57 @ 1.1 GHz  8GB GDDR5  570         mpss offload version
 Xeon Phi 3120A  PCI-card coprocessor  x100 (KNC)  57 @ 1.1 GHz  6GB GDDR5  545   1130  As reported by user (thanks Jeremy!)

Why are the x100s so much slower than the x200s? Well, you’re comparing a roughly 5-year-old piece of hardware (which is, at least according to ark.intel.com, already officially “end of life”d!) against a much newer one…. In particular, you’re comparing an architecture with tiny caches and regular DDR RAM to one with 30ish MB of cache and 16GB of “high-bandwidth memory” (aka MCDRAM); and one with relatively “wimpy” in-order cores to one with rather powerful out-of-order ones.

Note the x100s aren’t actually all that bad in absolute terms – 650H/s on a 7120 is still comparable with a brand-new 1070, and certainly pays for the power! – but hey, it would be strange indeed if they were still competitive with an x200…

Is it still worth mining on an x100? Well, there you have me. If I did have some spare money to invest I’d rather put it into x200s – they’re just so much more profitable. But if you already have them? Or get them at the right price? Sure, it still pays for the power! And since this is the older generation, there seem to be a lot of old X100 machines out there that are now being replaced with newer generation hardware …. (just snapped up a complete server on ebay, with Xeon cpu, memory, two 5110s, PSU, and everything, for only $800 total!).

What machines do those cards work in? That of course is the nasty question, because they do not work in all motherboards. At the very least, your board’s BIOS has to support “Above 4GB decoding” (sometimes called Large-BAR support, BAR being the PCI Base Address Register). There may be other restrictions, but if there’s no above-4GB decoding, it won’t work. Also, as mentioned above, make sure your system can actually cool those cards: If you have active ones (with fan) they’ll work just fine in a desktop case (see pic), but for the passive ones, you should have them in a server with strong airflow (or you’d better get very creative in your cooling!). Here are two examples: One is a desktop I built from parts with an Asrock X99 motherboard, in a desktop case; the second is a refurbished 1U server I bought off eBay.

Just to be sure, in that server I did the “usual” trick of clipping the “lower” two fan cables (the ones for fan control) to force full fan speed (banshee!)… probably not necessary in a cooled data center, but hey, my basement isn’t exactly professionally cooled …. (top right pic; apologies for the low quality, but you can just make out the blue and yellow cables being clipped)

Running lukMiner on a x100 Phi system…

Basically, there are two ways of running on an x100 Phi – “native mode”, and “MPSS offload”. Though you can still run lukMiner in so-called ‘native’ mode, I will from now on assume that this is “advanced usage”, and that whoever wants to do so will already know how to do it. Thus, from now on I’ll assume you want to use lukMiner on x100 Phis “the easy way”, using the MPSS stack.

Background: The x100 MPSS Stack

The MPSS (Manycore Platform Software Stack) is the OS software stack used to drive x100 cards. For lack of a better word, it’s what on a GPU you’d call “the driver” :-). If you already know what the MPSS stack is, and already have everything installed, you can just skip to the next section … but then you wouldn’t be reading this anyway, so I assume you’re new to this.

Step 1: Get the Latest Version of MPSS (Version 3.8.3)

You can freely download the MPSS stack from the following link: https://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss. At the time of this writing, the latest version you’ll find there is 3.8.3 (and since the x100s have been EOL’ed that is unlikely to change 🙂 ). Please use this version; I won’t test or support any others. Once you have it, just follow the next few steps.

Step 2: Choose a suitable Linux Version (i.e., CentOS 7.3)

In theory, the MPSS stack supports only RedHat and SUSE (in particular, Debian-based distributions such as Ubuntu, and others such as Arch, are not supported!). I personally don’t like either of those two choices, but luckily CentOS is fully compatible with RedHat, so for the remainder of this guide I’ll go with CentOS. In particular, I’m going with CentOS 7.3, and strongly suggest you do, too, since the steps below may be different on other Linux flavors.

Step 3: Install CentOS 7.3

Once you’ve downloaded your CentOS 7.3 image, put it on a USB stick, and install. I chose “gnome workstation/development tools” to start with, and suggest you do the same. You can always install missing packages later, but the steps below assume that this is what the OS was installed with.

Once installed, reboot, and accept the CentOS license. Note I would suggest you never (ever!) do a software update (aka “yum update”) on this machine; otherwise the MPSS kernel modules installed later might no longer match the updated kernel version. You can of course rebuild those, but it’s easier to just stick with 7.3 by simply never updating….
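If you are worried about an accidental “yum update” pulling in a new kernel anyway, one possible safety net is yum’s “exclude” option. The sketch below is an assumption on top of this guide, not something the steps here rely on: the helper name pin_kernel is made up, and it assumes the file’s last section is [main] (as in a stock /etc/yum.conf), so that the appended line lands there.

```shell
#!/bin/sh
# Hypothetical helper: append a kernel exclude to a yum config file, once.
# $1 = path to the config file (normally /etc/yum.conf).
pin_kernel() {
    conf="$1"
    # Do nothing if an exclude line mentioning the kernel is already there.
    if ! grep -q '^exclude=.*kernel' "$conf"; then
        echo 'exclude=kernel*' >> "$conf"
    fi
}

# As root, on the mining machine:
#   pin_kernel /etc/yum.conf
```

This only keeps kernel packages from being updated; other packages would still change, so simply never updating remains the safest route.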

Step 4: Get and Install MPSS 3.8.3

Once you rebooted and logged in, go to https://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss, and download version 3.8.3 for linux/redhat 7.3. You should end up with a file called “mpss-3.8.3-linux.tar[.gz]”. Unpack it:

[luk@x100] tar xavf mpss-3.8.3-linux.tar

Then, go into the resulting directory …

[luk@x100] cd mpss-3.8.3-linux

become root

[luk@x100] sudo bash 
<password>
[root@x100] ....

and install the rpm packages for both the core software stack and the kernel modules

[root@x100] yum install *rpm modules/*rpm

That may result in a lot of warnings, but hey, they’re only warnings, not errors… (I hope).

Step 5: Configure MPSS

There’s a lot of things you can do to configure MPSS – if you have some time, read the user guide! – but for now, let’s only do what’s necessary to run the miner. In particular, I will only describe how to do this as root – that’s not exactly good policy under linux (yes, I know that!), but it’s the easiest to do…

So, as root, generate an ssh key:

[root@x100] ssh-keygen

Now, let’s start the mpss stack for the first time, and initialize it:

[root@x100] service mpss start
[root@x100] micctrl --initdefaults
[root@x100] service mpss restart

Note the first time you run ‘service mpss start’ you’ll probably see some error messages – that’s fine because it hasn’t been initialized yet (that’s what the ‘--initdefaults’ in the second step does) – but at least it already loads the kernel modules that ‘--initdefaults’ requires :-).

Once the ‘service mpss restart’ has been done, you should now be able to do a

[root@x100] micinfo

… and should get some meaningful outputs for the mics you have in your system.
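For a quicker check than reading through the full micinfo output, you can also count the cards that report as online. This is only a sketch under assumptions: it assumes micctrl’s status option (-s) prints one “micN: state …” line per card, and the helper name count_online_mics is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count cards reported as "online" in status text read
# from stdin. Assumes one "micN: <state> ..." line per card.
count_online_mics() {
    grep -c '^mic[0-9][0-9]*: online'
}

# On the host, as root (assumes micctrl's -s status option):
#   micctrl -s | count_online_mics
```

If the count is lower than the number of cards you plugged in, the miner will not see the missing ones either.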

Note you’ll have to re-start the MPSS service after each reboot (though of course, you can automate that), but all the other steps have to be done only upon the first install (unless, of course, you change your config or update your linux kernel – which I suggest you don’t).

Step 6: Running the Miner

If you just completed the previous steps the MPSS daemon is still running; if not, first start it:

[root@x100] service mpss restart

Once the mpss service is properly running, running the miner is simple. Get the latest version, unpack it, then simply run:

[root@x100] ./luk-xmr-knc-mpss --host ...

and it should be working out of the box.
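Should the miner ever die on you (a segfault after a dropped pool connection is reported in the comments below), a small restart loop keeps it going. The following is a minimal sketch, not part of lukMiner: run_with_restarts and RESTART_DELAY are made-up names, and the run limit exists only so the loop cannot spin forever.

```shell
#!/bin/sh
# Hypothetical wrapper: re-run a command each time it exits, up to a limit.
# $1 = maximum number of runs; remaining arguments = the command to run.
# RESTART_DELAY (seconds, default 5) is an illustrative knob, not a miner option.
run_with_restarts() {
    max="$1"; shift
    runs=0
    while [ "$runs" -lt "$max" ]; do
        "$@" || true                      # a crash should not stop the loop
        runs=$((runs + 1))
        echo "command exited (run $runs of $max)" >&2
        sleep "${RESTART_DELAY:-5}"
    done
}

# Example, with the same arguments you would normally pass to the miner:
#   run_with_restarts 1000 ./luk-xmr-knc-mpss --host ...
```

The short pause between runs avoids hammering the pool if the miner fails immediately on startup.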

A few notes:

  • The miner will automatically mine on both CPUs and all available MICs. If you don’t want the CPU cores to be used, pass “-t 0” on the command line. (but hey, why shouldn’t you use them???)
  • In the simple setup explained above, you have to run the miner as root. You can change that, but that’s up to you.
  • The “service mpss start” has to be re-done upon every boot. Alternatively, you can put a “service mpss start” into /etc/rc.local (and do a chmod +x /etc/rc.local) to do it automatically upon reboot.
  • If against all warnings you decide to go with another Linux flavor, release version, different driver version, etc: I won’t even reply… :-/
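The rc.local approach from the notes above can be sketched as follows. The helper name enable_mpss_at_boot is made up for illustration; it takes the file path as an argument so you can try it on a scratch file before touching the real /etc/rc.local.

```shell
#!/bin/sh
# Hypothetical helper: make an rc.local-style script start MPSS at boot.
# $1 = path to the script (normally /etc/rc.local).
enable_mpss_at_boot() {
    rc="$1"
    # Add the start command only if that exact line is not already present.
    grep -qx 'service mpss start' "$rc" 2>/dev/null \
        || echo 'service mpss start' >> "$rc"
    # rc.local is only executed at boot if it is marked executable.
    chmod +x "$rc"
}

# As root:
#   enable_mpss_at_boot /etc/rc.local
```

Running it twice is harmless, since the line is only appended when it is missing.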

Either way – that’s pretty much it. Should work out of the box.

With that: Happy mining!

 

Published by

lukMiner

To learn more about me, look at the "About" page on http://lukminer.org

13 thoughts on “(luk-)Mining with x100 “Knights Corner” Xeon Phis (3110, 5110, 7120, etc)”

  1. Thanks for your work. I have two 7120s and I’m using the knc-native miner; it works very well, but sometimes I get this message:

    semi-fatal error in serving stratum 1: could not read from socket!
    … cancelling active job and reconnecting
    Segmentation fault

    And the miner stops. How can I fix this?

    1. Sigh; yes, that’s a known bug, and I still haven’t found the time to fix it. Easiest is to simply put a ‘while [ true ] ; do ./luk-…. ; done’ around it (then it’ll simply restart after every core dump). Also, if you use 0.9.1 you can use the mpss-offload miner, which should be easier than the native one.

  2. Thanks for the tip.
    I run the miner for 2.5 hours now and suddenly the pool stops receiving my shares, but the miner keeps submitting shares, it seems the miner is mining only to your pool (I can recognize the share because of the difficulty, your pool has 20.000 mine 15.000). In the last 30min the miner mined only for you 😦
    PS: my mining pool is working, I have other miners running well.

      1. Ouch. I’ll have a look tonight, after work.

        And no prob with the bad news: I’d rather hear about it early (so I can fix it) than have somebody start to believe that this was intentional … :-/

        What confuses me is that I recently switched most of my own machines over to 0.9.1 as well, and I don’t see this any more (and yes, I did see it myself in 0.9). Anyway; I’ll have a look as soon as I can! Thanks for reporting!

  3. [20:11:22] ##################################################################
    [20:11:22] # this is lukMiner v0.9.1 (for Monero), starting up #
    [20:11:22] # Release Notes: #
    [20:11:22] # – more fixes to deal with ‘challenging’ pools (aka nicehash) #
    [20:11:22] # – added ‘--no-fail-on-malloc’ option that changes behavior on #
    [20:11:22] # failed huge-page allocations from error to warning #
    [20:11:22] # – changed scheduling interval between user and dev share #
    [20:11:22] # accounts. This doesn’t change devshare itself, but averages #
    [20:11:22] # it out more evenly) #
    [20:11:22] # – first windows version (distributed separately) #
    [20:11:22] ##################################################################

    [..]
    [20:39:18] submitting share w/ difficulty 20000
    [20:39:18] -> share *accepted*: 60/60 (100.00%) – total hashrate 654.15H/s (may take a while to converge)
    [20:39:30] *** total hash rate: 655H/s (may take a while to converge)
    [20:39:47] knc thread #45: share FOUND (context 0, nonce 0x000117D8)! (hash rate this thread = 2.68207H/s)
    [20:39:47] submitting share w/ difficulty 20000
    [20:39:47] -> share *accepted*: 61/61 (100.00%) – total hashrate 653.88H/s (may take a while to converge)
    [20:39:52] *** total hash rate: 653H/s (may take a while to converge)
    [20:40:01] knc thread #136: share FOUND (context 3, nonce 0x00013B4B)! (hash rate this thread = 2.68375H/s)
    [20:40:01] submitting share w/ difficulty 20000
    [20:40:01] -> share *accepted*: 62/62 (100.00%) – total hashrate 652.27H/s (may take a while to converge)
    [20:40:13] *** total hash rate: 652H/s (may take a while to converge)
    [20:40:33] *** total hash rate: 655H/s (may take a while to converge)
    [20:40:43] semi-fatal error in serving stratum 0: could not read from socket!
    … cancelling active job and reconnecting
    [20:40:43] connecting to pool pool.minexmr.com:5555
    [20:40:53] *** total hash rate: 655H/s (may take a while to converge)
    [20:41:13] *** total hash rate: 652H/s (may take a while to converge)
    [20:41:22] watchdog says everything is OK: accepted 12 shares since last check …
    [20:41:33] *** total hash rate: 654H/s (may take a while to converge)
    [20:41:54] *** total hash rate: 655H/s (may take a while to converge)
    [20:42:11] knc thread #76: share FOUND (context 5, nonce 0x00028775)! (hash rate this thread = 2.68551H/s)
    [20:42:11] submitting share w/ difficulty 20000
    [20:42:11] -> share *accepted*: 63/63 (100.00%) – total hashrate 653.65H/s (may take a while to converge)
    [20:42:13] new job at difficulty 15000 (acct #0)
    [20:42:14] *** total hash rate: 652H/s (may take a while to converge)
    [20:42:17] new job at difficulty 15000 (acct #0)
    [20:42:17] new job at difficulty 20000 (acct #1)
    [20:42:18] new job at difficulty 20000 (acct #1)
    [20:42:34] *** total hash rate: 653H/s (may take a while to converge)
    [20:42:48] knc thread #29: share FOUND (context 5, nonce 0x000045D5)! (hash rate this thread = 2.67889H/s)
    [20:42:48] submitting share w/ difficulty 20000
    [20:42:48] -> share *accepted*: 64/64 (100.00%) – total hashrate 653.59H/s (may take a while to converge)
    [20:42:54] *** total hash rate: 654H/s (may take a while to converge)

    1. Oh-kay; I _think_ I can finally reproduce – apparently this is related to a bug that I already fixed going from 0.9.0 to 0.9.1, but which I (apparently) only fixed on the CPU, not for the MPSS offload thread. In other words, it _only_ shows up with the mpss offload, and only in “-t 0”, which is why I couldn’t reproduce it until I physically sat in front of the “right” machine.

      Either way – if it is what I think it is, it’s easy to fix. The only bad part about that news is that this particular bug can _not_ explain why it _never_ switched back to the user – it could get stuck in one job for minutes, but not forever; and it was just as likely to remain stuck with a user job as with the dev share job. Either way, I’ll have another look – but can only do this late at night when I’m back at that particular machine…. apologies.

  4. I was using the native version of the miner right on the coprocessor without any “-t” argument. Is there any test I can do to help you out?

    1. Hey, Digi – see my latest post: I am absolutely speechless with embarrassment, but you were absolutely right: the MPSS version did indeed get stuck in dev shares: it printed the right hash rate, but a copy-n-paste error means it never reported this executed work back to the accounting code, so eventually the accountant said “we need more dev shares”, which of course it never saw … Ugh. If you did run for a significant amount of time let me know; I’ll gladly reimburse you for what you must have mined for my account. Apologies, apologies, and apologies again …
