2019 March 05 (Adam) After many starts and stops writing this entry, it is finally time to tell the story of Grand Prix Kansas City 1999. I added what I could salvage from this tournament to the site about a year ago—I have rounds 4 through 12 of the twelve rounds of Swiss, and a scattered few matches from the first three rounds. You may recall the puzzle I had to solve in order to reconstruct the Saturday matches of GP Philadelphia 2000. That was like trying to replace the batteries in an antique watch. Kansas City was like trying to put the watch back together after it was thrown into a muddy puddle and then run over by a truck. Fair warning, we're going to have to talk about not just usual tiebreakers, but second- and third-order tiebreakers soon. For context, coverage has come full circle in the two decades since gpkc99 (March 27-28, 1999). At the time Wizards's official site would have coverage ("cybercasts") for some tournaments but not all; pairings and standings for some GPs in this era were preserved because they appeared on third party sites. That might have been the tournament organizer, which is the case for gpkc99: the relevant information appeared on New Wave's site. Other GPs, like Seattle 2000, had text coverage hosted on the Dojo. Some of those pages migrated over to the official tournament archive, but not all of them; the official coverage link for gpkc99 points to a page that only has a text recap of the top 8. I've dug around quite a bit looking for references to other third-party pages that might have coverage on them, but haven't yet found any others that made their way onto the Wayback Machine. There's a small but nonzero chance that, for example, some of the missing European or Asian GPs once had coverage posted somewhere and I haven't had the good fortune to stumble upon them. New Wave's coverage archive has the suggestion of pages for other tournaments they hosted. The jewel among them is GP San Francisco 1999. 
Unfortunately the New Wave archive was only discovered by the Wayback Machine once, and almost all of the SF results pages timed out, so there isn't really any hope to reconstruct that tournament like we're about to do here. I wonder sometimes whether the .html files are still out there somewhere... I got the sense that the New Wave coverage was largely spearheaded by Alex Shvartsman, much as the Dojo coverage seems to have been orchestrated by Mike Flores in places. Is there a chance that gpsf99 is on a zip disk in his garage somewhere? I have similar questions about Pro Tour New York 1998 and Pro Tour Chicago 1998—those tournaments were at one time 100% on the internet but they didn't migrate to the current version of Wizards's coverage archive, and the Wayback Machine didn't capture every page from those two events. Are they sitting on a backup tape drive in a filing cabinet in the basement of WotC headquarters? As for Kansas City, the pages that were on the internet were captured but not everything was posted in the first place. Coverage starts with round three standings on day one, and with round 8 standings on day two. So I have pairings for rounds 4, 5, 6, 9, 10, 11, and 12; and I have standings (including all tiebreakers) for rounds 3, 4, 5, 6, 8, 9, 11, and 12. Yes, the GP only had twelve rounds of Swiss, six on each day. The day two cut was to the top 64, which worked out to everyone at 5-1 or better and one lucky person at 4-0-2. There was a second, much less lucky, person in 65th at 4-0-2. You could have shown up for this event with three byes, lost two matches, found yourself dead for day two, and dropped having played less Magic than you would have played at FNM. Also note that awkwardly round 10 standings are missing: the page exists but evidently the data was corrupted back in 1999. You'll notice that I only said "pairings," not "results." 
This is not a deal-breaker: we can make inferences as to the results of matches based on how the numbers of match points change from round to round. (Actually, the pairings themselves had lines like Finkel, Jon (414) 27 1 Rubin, Ben * (456) 24 saying that Finkel, with 27MP, played against Rubin, with 24MP, on table 1. So the match point information is actually preserved with some redundancy.) But I also didn't say anything about the pairings in rounds 1, 2, 3, 7, or 8. They are missing. This is a big issue. It is also in some sense the opposite of gpphi00, where we knew the pairings and didn't know the results. That's much easier than knowing the results without knowing the pairings. Without even having standings for the day one rounds, there's not really any hope of recovering those, so I dismissed the possibility of recovering rounds 1-3 immediately. However I thought there was a real chance of reconstructing rounds 7 and 8, with the information we had, so let's make that our goal. The top 64 made day two, so there should be 32 matches in both rounds 7 and 8. The first drops didn't occur until after round 8. This can be verified from the way the score reporter printed the standings. As an example, here's the top line from round 8's standings.  1 Finkel, Jon 24 75.8333 94.1176 71.7677 5/5/0/3 This says Finkel was in first place with 24MP; his tiebreakers were, in order, 75.83 94.11 71.76 (we'll learn about these soon); and then the last entry "5/5/0/3" says he played five matches, winning five, zero draws, and three byes. (From this you can deduce zero losses.) After round nine there are some people with lines like "7/4/0/1" implying they had dropped before round nine. But the round eight standings show everyone in the top 64 as having played eight matches. So at the start of the project we're 0/64 on matches deduced. That's a lie. Both round 7 (Paschover vs. Finkel) and round 8 (Price vs. 
Maher) had a feature match with text coverage, so there are two matches where we know who played whom. Also there are eight surviving tournament reports from archived snapshots of the Dojo which were written by people who made day two. Unfortunately a couple of the authors didn't know the names of some of their opponents, or elected not to include them. Still, after this free information, we are at 14/64. (Also those tournament reports got us a couple of matches from rounds 2 and 3... I'll take whatever I can get!)

For several people we can see that their after-round-8 match point total is either six more than their after-round-6 match point total, or is equal to their after-round-6 match point total. In that case we know that they won both their matches or lost both their matches, but we don't know against whom. We also have some "loose ends," half-finished players for which we know one of their opponents but not the other. It's time to get our hands dirty.

Magic tournaments track three tiebreakers. The main one, which is what is usually meant when someone just says "tiebreakers," is the average of your opponents' match win percentage. (This is called OMW, for opponent match win%.) For each opponent, calculate the ratio [player's MP] / [3 × rounds played], and then average those ratios. The caveat is that a number less than 1/3 gets replaced with 0.3333.

Here's how we can leverage tiebreakers to discover something about missing opponents. Here are Jon Finkel's opponents and their records after round 8. Note Jon had three byes.

R4  Jamie Parke      4-2
R5  Jacob Welch      5-3
R6  Gary Krakower    6-2
R7  Marc Paschover   7-1   [known from feature match]
R8  [unknown]
tiebreaker (omw)     75.8333   [known from R8 standings]

Okay, to be fair, we can deduce that Jon's unknown opponent is also 7-1 without doing math, since he should have been paired against another 7-0 player in round eight and he won. But still, let's see how this is done with tiebreakers.
The information in the table above leads to the equation $$\frac{1}{5} \left( \frac{12}{18} + \frac{15}{24} + \frac{18}{24} + \frac{21}{24} + x \right) = 0.75833 \text{,}$$ where x is the match point ratio of the unknown player. Solving the equation gives x = 0.87499. We know in this case that x should be a fraction of the form y/24; solving y/24 = 0.87499 gives y = 20.9998, so up to a rounding error we get that the unknown opponent had 21 match points, so was 7-1 after round 8.

Is this even good? There were sixteen people who were 7-1 after round 8, so all we know is that Finkel's opponent was one of those. (There are a couple we can rule out: it isn't Paschover, since they played round 7, for instance. A couple of them also have opponents known from tournament reports.) There are two ways we can proceed.

We know Finkel's opponents going forward and we know the tiebreakers after future rounds, so we can learn extra information about the unknown round 8 opponent by looking into the future. Here's the situation after round 9. Note Krakower and Paschover make different contributions now than they did before since they played in round 9.

R4  Jamie Parke      4-2
R5  Jacob Welch      5-4
R6  Gary Krakower    7-2
R7  Marc Paschover   7-2   [known from feature match]
R8  [unknown]
R9  Lan D. Ho        8-1
tiebreaker (omw)     74.0741   [known from R9 standings]

A calculation like before says that the unknown player's match win percentage is 0.7777, so they're 21/27, or 7-2. This shrinks the pool of possible players from 16 down to 7, as now we need someone who went 7-1 into 7-2. Since we don't have round 10 standings we don't get information about the unknown opponent's record after R10, but we can learn R11 and R12 from the extant data. This "signature" of a player's R8, R9, R11, R12 records often identifies them uniquely, or at worst will make them a member of a set of at most two or three people.
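The OMW equation above can be solved mechanically. Here is a minimal Python sketch of that step, assuming the unknown opponent's own ratio is not below the 1/3 floor; the function names `match_win_ratio` and `unknown_opponent_points` are hypothetical helpers made up for illustration, and the records are the ones quoted in the two Finkel tables.

```python
from fractions import Fraction

FLOOR = Fraction(1, 3)  # ratios below 1/3 are replaced by 1/3


def match_win_ratio(wins, losses, draws=0):
    """Match points divided by three times rounds played, floored at 1/3."""
    played = wins + losses + draws
    ratio = Fraction(3 * wins + draws, 3 * played)
    return max(ratio, FLOOR)


def unknown_opponent_points(known_records, omw_percent, rounds_played):
    """Solve the OMW average for the single missing opponent.

    known_records: (wins, losses) for each identified opponent
    omw_percent:   first tiebreaker as printed in the standings (e.g. 75.8333)
    rounds_played: rounds the unknown opponent has played so far
    Returns the implied match points before rounding; a value far from an
    integer suggests some input is wrong.
    """
    n_opponents = len(known_records) + 1
    known_sum = sum(match_win_ratio(w, l) for w, l in known_records)
    x = omw_percent / 100 * n_opponents - float(known_sum)
    return x * 3 * rounds_played


# Finkel after round 8: Parke 4-2, Welch 5-3, Krakower 6-2, Paschover 7-1
after_r8 = unknown_opponent_points([(4, 2), (5, 3), (6, 2), (7, 1)], 75.8333, 8)

# Finkel after round 9: the same opponents with updated records, plus Ho 8-1
after_r9 = unknown_opponent_points(
    [(4, 2), (5, 4), (7, 2), (7, 2), (8, 1)], 74.0741, 9
)

print(round(after_r8, 3), round(after_r9, 3))
```

Both values land within rounding error of 21 match points, matching the 7-1 and 7-2 readings in the text. Repeating the call for each round with surviving standings would build the "signature" one round at a time.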
It's possible that there will be two people with a given signature but we know the R7 and R8 opponents for one of them, and if that happens then the fact that the signature wasn't unique won't actually hinder us.

Let's look at the second way to accomplish this, with the second and third tiebreakers. The second tiebreaker is your own game score: it's the number of "game points" you have earned divided by three times the number of games you played. Game points are like match points; you earn three points for a win, one point for a draw, and no points for a loss. As such a 2-0 win counts as 6/6 game points, a 2-1 win counts as 6/9, etc.

Draws are annoying for game scores, since the score depends on reporting the correct kind of draw. If you draw because game three didn't reach a conclusion, that's a 1-1-1 match result, so 4/9 game points. If you ID, that's an 0-0-3 match result, so 3/9 game points. If you draw because game two finishes in extra turns and you don't get to start game three, that's a 1-1 match result, so 3/6 game points. This never matters in practice, but in doing tiebreaker math I've noticed that occasionally draws are put in as 1-1 (3/6) instead of 0-0-3 (3/9) like they're supposed to be.

The final tiebreaker is the average of your opponents' game point percentages. (This is OGW, for opponent game win%.) As with match points, there is an artificial floor of 0.3333 imposed on your opponents who have own game scores below that percentage. The second tiebreaker will report a number less than 1/3, but the number that gets used in the third tiebreaker calculation will be inflated to 1/3.

Let's look at Finkel again post round 8, this time examining the game scores of his opponents. Usefully the game scores can be read off of the round 8 standings, since those are the second tiebreakers. So we don't have to try to reconstruct game scores for all the previous matches in order to use the third tiebreaker.
R4  Jamie Parke      72.7273
R5  Jacob Welch      55.5556
R6  Gary Krakower    75.0000
R7  Marc Paschover   77.7778   [known from feature match]
R8  [unknown]
tiebreaker (ogw)     71.7677   [known from R8 standings]

This means that, like before, the opponent's game score percentage solves the equation $$\frac{1}{5} \bigl(72.7273 + 55.5556 + 75 + 77.7778 + x\bigr) = 71.7677\text{.}$$ The solution is x = 77.7778. Now we are looking for someone whose own game percentage (second tiebreaker) is 77.7778 and who has 21 match points after round 8. There are only three such people: Tony Tsai, Craig Dushane, and Marc Paschover (who is ineligible to have played Jon round 8). We can go deeper and reach a conclusion now: Tsai entered day two at 6-0 and Dushane entered at 5-1. We know that Jon played another 7-0 and beat them, since Jon winds up on 24MP and tiebreakers showed his opponent ended up at 7-1. Therefore only Tony Tsai could have been Finkel's opponent. It took seven paragraphs and ~1000 words, but we now are 15/64.

Did you notice that there was something special about Jon Finkel that made the calculations possible? Puzzle that for a paragraph.

To recap, there are two different pairs of information that can help shed light on unknown opponents: we can use the combination of the first tiebreaker (OMW) together with players' match points, or we can use the combination of the third tiebreaker (OGW) with the second tiebreaker (the games equivalent of match points). Using these, we can build a signature of the unknown opponent's record in future rounds. Eventually hopefully this process will narrow down the set of possibilities to one player, or at least one player among those that are unaccounted for. We started with several "loose ends" since we knew only one of the two opponents for several players who happened to have played one of their rounds against someone who wrote a tournament report.
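The OGW step for Finkel's round 8 opponent can be sketched the same way: solve the average for the missing game score, then filter the standings for players with a matching second tiebreaker and match point total. This is only an illustration. The `standings` list holds just the handful of rows quoted in this entry rather than the full round 8 standings, and `solve_missing` and `candidates` are hypothetical helper names.

```python
def solve_missing(known_values, average, n_total):
    """Return the one missing value given the average of n_total values."""
    return average * n_total - sum(known_values)


def candidates(standings, match_points, game_pct, exclude=(), tol=0.01):
    """Names whose match points and own game score fit the solved values."""
    return [
        name
        for name, mp, gp in standings
        if mp == match_points
        and abs(gp - game_pct) <= tol
        and name not in exclude
    ]


# (name, match points after round 8, own game score); rows quoted in the text
standings = [
    ("Tony Tsai", 21, 77.7778),
    ("Craig Dushane", 21, 77.7778),
    ("Marc Paschover", 21, 77.7778),
    ("Gary Krakower", 18, 75.0000),
    ("Jacob Welch", 15, 55.5556),
]

# Finkel's known opponents' game scores and his OGW after round 8
x = solve_missing([72.7273, 55.5556, 75.0000, 77.7778], 71.7677, 5)

# Paschover is excluded because he was Finkel's round 7 opponent
shortlist = candidates(standings, 21, x, exclude={"Marc Paschover"})

# Day one records from the text: Finkel's round 8 opponent was 7-0 going in,
# so they must have entered day two at 6-0
day_one = {"Tony Tsai": (6, 0), "Craig Dushane": (5, 1)}
answer = [name for name in shortlist if day_one[name] == (6, 0)]
print(round(x, 4), shortlist, answer)
```

The tolerance `tol` is an assumption to absorb the four-decimal rounding in the printed standings; too loose a value would admit false matches in a larger field.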
We pray that filling in loose ends will create other loose ends and we will eventually untangle all 64 missing pairings.

The thing that was special about Jon Finkel is that he had three byes, so we had otherwise total knowledge about all his other opponents. Let's jump from Finkel to Tony Tsai now. He only had two byes. Here's what we know about his tournament so far. (Remember, annoyingly, we don't have R7 standings.)

Rd  Opponent         R8 record   R8 game pct
R3  [unknown A]
R4  Danny Speegle    3-2         60.0000
R5  Mike Caselman    4-2         56.2500
R6  Devon Herron     6-2         68.4211
R7  [unknown B]
R8  Jon Finkel       8-0         94.1176
R8 tiebreakers       71.1111     66.2500   [omw / ogw]

How can we make progress when there are two variables in the equations? Don't forget that we have information after round 6, too! The best possible result for us is if Tony's round 3 opponent did not make day two. If that's the case, then the record of the unknown round 3 opponent will not change between round 6 and round 8, and the first line of blanks will get filled in, ready for use in the round 8 calculations. With this in mind let's strip off rounds 7 and 8 and look at the end of day one standings.

Rd  Opponent         R6 record   R6 game pct
R3  [unknown A]
R4  Danny Speegle    3-2         60.0000
R5  Mike Caselman    4-2         56.2500
R6  Devon Herron     5-1         71.4286
R6 tiebreakers       65.0000     60.8085   [omw / ogw]

From this table we infer that Tony's round 3 opponent had a .5000 match win percentage (either 2-2 or 3-3, we can't tell but it doesn't matter) and a game win percentage of .5556. Usefully, they did not make day two. So their contribution isn't going to change between rounds 6 and 8. We can go back to the first table and fill in the static information about unknown A, leaving us only with unknown B to consider.
Rd  Opponent         R8 record   R8 game pct
R3  [unknown A]      2-2, say    55.5556
R4  Danny Speegle    3-2         60.0000
R5  Mike Caselman    4-2         56.2500
R6  Devon Herron     6-2         68.4211
R7  [unknown B]
R8  Jon Finkel       8-0         94.1176
R8 tiebreakers       71.1111     66.2500   [omw / ogw]

This table implies that unknown B was 6-2 after round 8 and had a 63.1555 game score. Only one person fits that bill: Justin Holt. Since both entered day two at 6-0 and Holt is now 6-2, the result of the round 7 match was a win for Tsai. 16/64! (As a footnote, nobody after round six had nine match points and a 55.55 game score. But there were fourteen people who were 2-2 with that game score. Since they dropped after round 4 there's not much hope of figuring out who they were.)

What would we have done if Unknown A had made day two? I think the only logical options are panic and despair. The problem in that case is that the contribution that Unknown A would have made to the round 6 tiebreakers will not match the contribution they make to the round 8 tiebreakers, so learning where they were after round 6 isn't particularly helpful.

Many of the players who made day two had one or zero byes, so in place of our single mystery player we would calculate from the round 6 standings a sum of two or three mystery players' statistics. If the stars align and none of them made day two, then their agglomerated tiebreakers will contribute the same amount towards round 8, and we can then isolate the single missing person just like what we did for Tony.

You should probably be asking right now, if we're in a situation where there are multiple unknown day one opponents getting clumped together, how would we know whether any of them made day two in the first place? It shows up when you try to calculate the unknown day two opponent's information from tiebreakers. We're expecting to see match win percentages of .8750 for a 7-1 record, .7500 for a 6-2 record, or .6250 for a 5-3 record.
(A couple of players have draws, but excluding those for now these are the only options. Two players were 8-0 and we have them taken care of.) Suppose we infer a match win percentage of .8525; that would be 20.46 match points out of 24. That's bad news. A result like that means that something is wrong upstream: someone from day one is making a different contribution to round 8 than they did to round 6. Unfortunately that player's tiebreakers are then useless, since we can't isolate the signature of their unknown day two opponent. I didn't calculate tiebreakers for everyone, since both opponents were already known for whatever reason for several people at this point. Of the players I did calculate, eleven had useless tiebreakers. This adds a level of suspense to our excavation effort, since at the bottom of our well is now a swill of uncertainty.

I mentioned draws in the previous paragraph. There were six people whose round 8 match points differed from their round 6 match points by 1 or 4. Those people had to have played each other in at least one of their matches. For a couple of them one non-draw result was already known, and that forced their other match to be a draw. The location of the six draws was comparatively easy to isolate.

You may recall that our over-arching plan was to pull on loose ends (players for which one of their opponents' identities is known) until our knot untangled. I have sad news: we won't get to the finish line this way. At some point in the high 30s I got stuck; all the loose ends involved people with useless tiebreakers, so I needed one new idea to get to the end. Let's look at Eric Lauer, who had three byes but didn't have either of his opponents' identities uncovered up to this point. He goes from 5-1 after round six to 6-2 after round eight.
Rd  Opponent         R8 record   R8 game pct
R4  Brent Parr       7-1         75.0000
R5  Devon Herron     6-2         68.4211
R6  Joel Noble       4-2         61.5385
R7  [unknown A]
R8  [unknown B]
R8 tiebreakers       75.8333     66.2551   [omw / ogw]

The goal is to try to tease apart the two missing data points from their sum. For match win percentage, the contribution of A+B is 1.5 in aggregate. Multiplying by 24 tells us that A+B had 36 match points altogether. Either they were both 6-2 or one was 7-1 while the other was 5-3. Assuming that there aren't any pairdowns, the first situation can't occur! This is because Lauer either goes WL or LW. If it's WL, then opponent B plays him in round eight where both are 6-1, and Lauer loses, so B winds up 7-1. Otherwise opponent B plays Lauer in a round eight match where both are 5-2, and Lauer wins, so B winds up 5-3. There aren't many 7-1 slots to go around at this point, so this is possibly useful already.

Even more powerful is to look at the aggregate contribution of the game win percentages. The contribution of A+B to OGW is 126.30. I then wrote a program in Python to look at every possible way that two own game scores (second tiebreakers) could add up to 126.30, and it turns out that the only pair among the ones that were left at the time is 68.42 + 57.89, and only Jeff Matter has a 57.89 second tiebreaker. Even better, only John Lagges has the combination of a 68.42 second tiebreaker plus a 7-1 record. (Nobody has 68.42 + 5-3.) So now we know that Lauer plays Matter + Lagges in some order.

This potentially gets us un-stuck, since now both Matter and Lagges are "loose ends" as one of their opponents is known. We just don't know whether Lauer plays them in round 7 or in round 8. Further down the line this thread of reasoning hooked into someone with useless tiebreakers, for which one of their opponents was already known. That then snapped everything we had done so far into place.

These ideas plus a lot of patience were able to determine all 64 matches.
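For readers who want to try the pair search on Lauer's table, here is a small Python sketch of the idea. It is not the program mentioned above, just one way the search could look. The pool contains the two game scores quoted in the text plus two clearly hypothetical placeholder entries; the real pool would be every day two player still unaccounted for. `pairs_summing_to` is a made-up name.

```python
from itertools import combinations


def pairs_summing_to(scores, target, tol=0.02):
    """Return all pairs of players whose own game scores add up to target.

    scores: dict mapping player name to second tiebreaker (percent)
    tol:    slack for rounding in the printed standings (an assumption)
    """
    hits = []
    for (name_a, score_a), (name_b, score_b) in combinations(
        sorted(scores.items()), 2
    ):
        if abs(score_a + score_b - target) <= tol:
            hits.append((name_a, name_b))
    return hits


# Lauer's OGW after round 8 is an average over five opponents; subtracting
# the three known game scores leaves the combined contribution of A and B
target = 5 * 66.2551 - (75.0000 + 68.4211 + 61.5385)

pool = {
    "Jeff Matter": 57.89,      # from the text
    "John Lagges": 68.42,      # from the text
    "Placeholder A": 60.00,    # hypothetical filler, not real data
    "Placeholder B": 72.00,    # hypothetical filler, not real data
}

pairs = pairs_summing_to(pool, target)
print(round(target, 2), pairs)
```

With the full pool, any surviving pair would still need to be checked against the match point split (7-1 plus 5-3), as described above.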
My first attempt at this didn't go well because I think I made some pretty shaky logical conclusions from useless tiebreakers somewhere early on in the process. For my second, successful attempt, I tried to be meticulous in note-taking so that I would have multiple save points in case something went south. Here's my main thread of notes (PDF), containing the 64 matches deduced in order. Here you can see my furious scribbling (JPG) trying to work out information about unknown opponents; this goes on for several pages. In the image you can see me working out the records for an unknown opponent in future rounds (boxed in each table). Sometimes I'm able to figure out the identities. For others, OGW calculations had it limited to a couple of people before I started. You'll see in the table in the middle that Ferguson's R8 opponent was either Stanton or Lewis, and the fact that the unknown opponent made OMW contributions of 18/24 and 18/27 in rounds 8 and 9 meant it must have been Lewis. Most of these calculations wound up in a spreadsheet that I used to track my progress.

I should add that there are two other places where I had to use this technique to recover lost pairings: GP Kuala Lumpur 2000 round 10 and Pro Tour Los Angeles 1998 round 4. These were significantly easier due to (a) having total information about all previous rounds and (b) only needing to reconstruct one round instead of two consecutive rounds. ptla98 R4 is the only one of these that took place on day one, so at the lowest tables we are trying to determine identities of players who had 0-3 records. This is typically impossible because the .3333 floor artificially obfuscates players' identities. Still, I was able to recover 156/164 matches, which I'm treating as a win. I believe that I could reconstruct all of the missing days of ptny98 and ptchi98 if I had the standings after each round, but sadly the standings are on the tape drive backup in the basement right next to the results and pairings.
I'm hoping I never have to do this again, though if it means more data on the site and we come across data that needs to be rebuilt I'm absolutely up for the challenge.

 2018 December 14 (Adam) I finally got to the bottom of my pile of grading and that means it's time for World Magic Cup stats! Data in this table covers everything that's on the site—individual pro tours and Grand Prix dating back to mid-1998, with data getting spotty in late 1999. (See the FAQ.) You can sort this table by clicking on a column heading. Note that with only three people on each team, the middle rating is the median. If you'd prefer to limit the sort to only the teams with three rated players, here's the average sort with those teams filtered out. Mouseover a rating to see the name associated with it, or click on a country to make the names appear. The blue rating is the team captain and the red rating is the national champion. For some countries those two people coincide (hence the purple), while for some others either the pro points champion or the national champion declined their invite. Best of luck to all participating teams!

 2018 September 13 (Adam) The World Championships are nigh! I've been posting stats to Twitter recently but haven't done a good job cataloging them here. My apologies... it's easy to feel like I'm done after I've fired off the tweets, but I should really do a better job cataloging my statistics projects in this space for people who aren't following me there. (Having said this, I do feel like the percentage of stuff I produce that is of a quality to appear here is not 100%, so if you want to see some half-baked numbers you might try following @ajlvi on Twitter.) You can find the following information on the twenty-four Worlds participants: a lifetime head-to-head grid (Reid Duke's 45-32 record against the other 23 players is pretty impressive), a breakdown of each player's season (basically what you'd hope to see on the back of each player's baseball card in the "2017-18" row), Elo-based metrics for each player (Brad Nelson's 2222 average Elo throughout the year is quite eye-popping), and the ever-controversial, just-for-funzies, have-I-ever-told-you-Elo-is-a-crude-tool results of simulating the tournament 2.5 million times using either Elo or average Elo as the only determiner of wins and losses. Of course the probability of winning will line up with the list of everyone's Elo in descending order, but the interesting thing to pay attention to here is the margins: they give you some idea as to how little a deal one rating point is. If I were a bookmaker I wouldn't set betting lines based on these simulations, but I would use them to remind myself that the margins are going to be very thin this weekend. Good luck to all!

 2018 July 25 (Adam) Here's a challenge for you — the kind of task you'd have to solve on an Elo project job interview. I'm recording it here so that I know where to find it in the future. At Grand Prix Chiba last weekend two different people with the name Ryo Takahashi registered for the event. Both of them went 6-2 on Saturday and advanced to day two, when someone realized that there were two people with the same name. In Sunday's rounds the two players have the last four digits of their DCI numbers attached to the end of their names so that they can be told apart. But if you look at, say, the round 4 results, you'll see two people with the same name. Using the tiebreakers of the two players (and some of their opponents) you can figure out whose day one results are whose. See if you can accomplish this. The correct answers are here on the site if you want to check your work, and of course I'd be happy to provide explanation if you want to know how to do this. It took me about twenty minutes to disentangle the two players' results, and my guess is that if you have never tried to do anything like this with tiebreakers before that you're going to need to set aside at least an hour to figure it out.

 2018 June 07 (Adam) I became aware last week that the version of Wizards's site that I thought was the oldest one that was archived by the Wayback Machine was in fact not the oldest. An older version from the 1998-99 era included coverage ("cybercasts") of PT Chicago 1997, PT LA 1998, PT NY 1998, and PT Chicago 1998. Unfortunately these pages were only trawled a couple of times: incompletely in early 1999 and unsuccessfully in late 1999. When the site was redesigned sometime in 2000, the cybercasts were not ported, and all future sites have copied off of the information available in the 2000 version. I also saw rumors in a post on the Dojo that Worlds 1997 may have had some sort of internet coverage, but it predates even this older version of the site. Here's the status of those PTs: ptchi97 is relatively complete, but day one information was never posted in the first place because of technical issues at the tournament site. The cybercast only consisted of standings, so it would be a challenge to try to get pairings out of nothing. The three from 1998 are more promising: pairings and standings once were on the internet. ptla98 is intact except for round 4, but the Wayback Machine failed to capture any of ptny98 day two or ptchi98 day one. (There was one intervening event, PT Mainz 1997, which didn't have a cybercast at all.) I've reconstructed 156/164 matches from ptla98 R4 from tiebreakers, and that tournament is now on the site. I don't think there's hope to reconstruct the others from what's available. I can almost imagine getting ptny98 back together if we had the final standings and tiebreakers; day two rounds have a small number of matches and the tournament reports that exist will fill some of it in, which might give me a toehold. On the other hand ptchi98 is missing day one, which is a much bigger disaster—the rounds are bigger and people will go 0-4 drop which means they won't show up in tiebreakers except as the minimum .3333. 
This makes it impossible to recover their matches from the standings. Still, I thought that after I added Kansas City 1999 that I wouldn't have any more old data to add, so getting another Pro Tour is pretty cool! Now I'm aware of five tournaments that were once on the internet that aren't on the site. Here's hoping that someday I'll get extra information which lets me rebuild them! Next post (which may not be for another month or so) I'll discuss the reconstruction effort of gpkc99.

 2018 March 08 (Adam) I promised a while ago to talk about the reconstruction effort I underwent to recover Grand Prix Philadelphia 2000. Fair warning: things may get a little technical ahead. In a round of a typical tournament, three pages of information are generated by the event reporter: a list of pairings at the beginning of the round, a list of the results of each match after all the match results are put into the system, and the standings as of the conclusion of the round in question. For our purposes, it's the middle one of those three that's the most useful, because we need two pieces of information for the dataset: (a) who played whom and (b) what the match result was. The results page just tells us that straightaway. Strictly speaking the results page is a convenience, since the information in it can be reverse-engineered from the rest of the coverage. If you know everyone's match point total as of round N-1, and you know the pairings for round N, and you can see the standings after round N, then we can figure out the results from round N. A player won her match if her match point total after round N is three more than that after round N-1, lost it if that difference is zero, and drew if the difference is one point. Sometimes the results pages are corrupted in some way (the most typical error is the round N results page being the same document as the round N-1 results page) and I use this method to recover the data for the site. You can imagine I wasn't impressed with the coverage page for GP Philadelphia 2000: none of the rounds have a results page at all, and the first time we even see standings is round 6. This means for round 7 onward we can recover the results by using the method outlined above. (Round 6 doesn't work because I don't know the starting number of match points — those would be in the round 5 standings page.) Then I crossed my fingers, because sometimes the pairings pages include the MP totals. These don't. 
All I knew about the first six rounds were the pairings. Would that be enough to recover the results?

On the face of it that may sound crazy, but there's reason to believe that there may be enough data here to figure everything out. The results for some people will be immediate from their match point total: if they have 18 match points they won every match they played, and if they have 0 match points they lost them all. This will distribute some losses to people who played the 18MP players and some wins to the people who played the 0MP players. Maybe after that sweep is done we'll have assigned a loss to some people with 15MP (= 5-1 record), so we'll know they won all the rest of their matches, or maybe we'll have uncovered a win to someone with 3MP, so they'll have had to lose all the rest of their matches. (Note that 3MP could have been a record of 0-x-3 or 1-x, but since we've found a win for that person, their quantity of points left to assign is zero.) Then we get to go back and take a second pass, looking for byelines that can be completely filled in. In a perfect world, this initial cascade might fill in all six rounds.

There were 582 people in the tournament, and the successive passes filled in 86, 52, 34, 28, 16, 13, 7, 6, and 2 people, for 244 total. That's something, but not everything. Most of the other players had some matches filled in, just not all of them. As an example, after my first sweep my Python structure had an entry of the form

Lowery, Brett 12 .W..L. [12, 0, 3, 12, 12, 9]

meaning Brett had 12 MP after round 6, with a win round 2 and a loss round 5 already accounted for. The list at the end stored the match points of all six of Brett's opponents. The possible results a player could have were W, L, D, B (bye) and X (drop). The pairings pages told me who had a bye in each round, so I at least had that going for me. A player dropped when he stopped appearing in the pairings. Thankfully nobody left and came back somehow.
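The cascade described above can be sketched in a few lines of Python. This is a toy reconstruction under stated assumptions, not the actual structure used for the site: the `players` dictionary layout, the `cascade` function, and the four-player, two-round example are all hypothetical, and a bye is assumed to be worth three match points. A player is settled when their unassigned points are either zero (every open match is a loss) or the maximum possible (every open match is a win), and each settled result is mirrored onto the opponent.

```python
POINTS = {"W": 3, "D": 1, "L": 0, "B": 3, "X": 0}  # bye assumed worth 3 MP
MIRROR = {"W": "L", "L": "W"}


def cascade(players):
    """Fill in forced results in place until a full pass changes nothing.

    players: name -> {"mp": final match points,
                      "results": list with "." for each unknown round,
                      "opponents": opponent name per round (None for bye/drop)}
    """
    changed = True
    while changed:
        changed = False
        for entry in players.values():
            unknown = [i for i, r in enumerate(entry["results"]) if r == "."]
            if not unknown:
                continue
            assigned = sum(POINTS[r] for r in entry["results"] if r != ".")
            remaining = entry["mp"] - assigned
            if remaining == 0:
                fill = "L"            # no points left: all open matches lost
            elif remaining == 3 * len(unknown):
                fill = "W"            # maximum points left: all open matches won
            else:
                continue              # still ambiguous; other techniques needed
            for i in unknown:
                entry["results"][i] = fill
                opponent = players[entry["opponents"][i]]
                opponent["results"][i] = MIRROR[fill]
            changed = True
    return players


# Toy event: round 1 is A-B and C-D, round 2 is A-C and B-D
players = {
    "A": {"mp": 6, "results": [".", "."], "opponents": ["B", "C"]},
    "B": {"mp": 0, "results": [".", "."], "opponents": ["A", "D"]},
    "C": {"mp": 3, "results": [".", "."], "opponents": ["D", "A"]},
    "D": {"mp": 3, "results": [".", "."], "opponents": ["C", "B"]},
}
cascade(players)
summary = {name: "".join(p["results"]) for name, p in players.items()}
print(summary)
```

In the toy event C and D are both ambiguous at first (3 MP from two open matches), and they only resolve after A's and B's forced results are mirrored onto them. An entry like Brett's, with 9 points left to spread over four open matches, would stay stuck.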
The goal now was to find ways to get myself "unstuck". If I could puzzle out an individual player's results somehow, then we could resume the cascade; even filling in one match might lead to settling a substantial number of players. The big cache of information that I've left untouched so far is the fact that the pairings are done by the Swiss system, meaning the identity of your opponents encoded some information about your record at the time of each match. I'll try to illustrate with examples some of the techniques I used to tap into that data. I believe the list below is exhaustive in the sense that applying the observations below, together with cascading, was enough to recover all the results.

• Look again at Brett's line above. The win already credited to him in round 2 turned out to be against someone who ended the tournament with zero match points. Brett's round 2 opponent was definitely 0-1 after round one, so if they played each other Brett (almost certainly) was 0-1 himself. Therefore I credited Brett with a loss in round 1. I should address here that there's of course the possibility of a pairdown. I made the simplifying assumption that there were no three-point pairdowns, since there were always people with draws intervening. For instance, in round two if a pairdown was necessary then there should have been a 3MP-1MP match and a 1MP-0MP match instead of a 3MP-0MP. If this assumption is violated and people can be paired across brackets, I'm afraid what we're trying to do becomes more augury than science.

• Here's another player's line: Magby, Mike 12 B...L. [xx, 7, 12, 12, 15, 9] Mike's round two opponent had a draw somewhere in the tournament. But it wasn't against Mike, because the only way to reach 12MP with a draw is by going 3-0-3. Using this exclusion principle, I checked each person with a draw to see if exactly one of their undetermined matches could have been a draw. Notice that "eligible to have drawn" is something that depends on how many match points are left to be assigned; a hypothetical person with 10MP after round 6 and an uncovered history of .DL.W. won't have any more draws, so they definitely didn't draw with their round 4 or round 6 opponents. This logic eventually uncovered that two people with 9MP had to have had a draw; they both were 2-1-3 after round six. Thanks for spicing up the project, guys.

• Suppose after round six you have P match points and your round six opponent has P-3 match points. Then you won round six. (Again, assuming no three-point pairdowns.) Similarly if you and your round six opponent wind up with exactly the same number of match points as of round 6, then you drew round 6. If the discrepancy between your round 6 total and your opponent's total is 1 or 2, then there was some sort of a pairdown; I didn't try to assign a result to a match like that at this stage. This logic applies to Mike Magby (above), who won against his round 6 opponent.

• The logic of the previous item can be extended. Suppose you dropped after round 5 with P match points, and your round 5 opponent showed up in the round 6 standings with P+6 match points. Then you played that opponent in a P-P match, the opponent won and went to P+3, and then they won again in round 6 and went to P+6. In short, they finished WW, and you ended LX. You can go yet another step here and consider people who dropped after round 4; if their opponents ended round six with nine more points than the dropped player, that opponent finished WWW.

Maybe there was one other item that I've forgotten about, but I believe these were the only methods that I used to fill in every result from the first six rounds. I was a little astonished at the end that everything was not only filled in, but also internally consistent; I think that illustrates how much information is already contained in the standings.
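The two point-gap rules above (the final-round comparison and its extension to opponents of dropped players) are simple enough to write down directly. The sketch below is only an illustration with hypothetical function names; like the text, it assumes there were no three-point pairdowns and declines to guess when the gap suggests a pairdown.

```python
def last_round_result(my_points, opp_points):
    """Result of the final known round, judged only from the final point gap.

    Assumes both players entered the match on the same number of points
    (no pairdown). Returns None when the gap suggests a pairdown.
    """
    gap = my_points - opp_points
    if gap == 3:
        return "W"
    if gap == -3:
        return "L"
    if gap == 0:
        return "D"
    return None


def finish_after_opponent_drop(dropper_points, opp_final_points, rounds_after_drop):
    """Closing results for someone whose opponent then dropped.

    If the opponent of a dropped player ends exactly three points per
    remaining match (including the match against the dropper) ahead of
    the dropper, that opponent must have won every one of those matches.
    """
    wins_needed = rounds_after_drop + 1
    if opp_final_points - dropper_points == 3 * wins_needed:
        return "W" * wins_needed
    return None
```

For example, `last_round_result(12, 9)` gives "W", which matches the Mike Magby line quoted above (12 MP against a round 6 opponent on 9 MP), and `finish_after_opponent_drop(6, 12, 1)` gives "WW" for the opponent of someone who dropped after round 5.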
My goal was to use the lightest touch necessary to recover all the results; I'm sure there's other ways to draw the same conclusions, but I wanted a set of axioms that would let the rounds fill themselves in as much as possible. This way if something went wrong there would be a more limited place to look for inconsistent hypotheses; this is especially valuable since future deductions depend on previous work. Unfortunately for the other big reconstruction project (GP Kansas City 1999) things need to be done more by hand. More on that job another time. I should address the question about whether the results I reconstructed are unique, or if there's some other way to fill in the grid that would assign everyone the appropriate number of match points. This mainly depends on whether there was a three-point pairdown really early in the process, since future deductions are based on previous results. I'd be somewhat surprised if what I came up with wasn't an exact match to historical fact, or at least was really close to it, so I ultimately decided to include the reconstruction on the site. It would be nice to try to reconstruct the data in a different order to check for discrepancies, but I admit I'm not optimistic that I'm going to have the time or motivation in the near future. If anyone else wants to torture themselves and go through this, though, I'd be happy to compare our results!

 2018 March 04 (Adam) I've been good at updating the site but not so good at recording my updates here. Since the last blog post, here's what's happened:

• Almost all available old data is on the site. There's a few things missing, mainly GP Kansas City 1999, which I need a block of several hours to work on. But practically everything that's ever been on the internet is now on the site. See a new FAQ item to see the timeline of early events. Roughly fifty tournaments from 1996-1999 are missing; I believe most of them had no internet presence even at the time. That means 92.5% of Magic history is here in terms of events, and probably closer to 98% in terms of total matches. (Unfortunately the 2% that's missing includes some relevant information for correctly rating some early stars like Steven O'Mahoney-Schwartz, Jon Finkel, Randy Buehler, etc.)
• We added a player search to the win percentage by format page to complement the leaderboards there. (This was one of the most requested features, so I'm glad we got it done!)
• Tables for top 8 likelihood by record now exist for Grand Prix. These are updated after every tournament.
• I've also posted to Twitter a couple of stats projects, like this sheet of cumulative records in knockout rounds. It's been added to the stats hub. Because it includes team data it's pretty unlikely that it will get incorporated into the site, but I'll try to update it once a month or so.

More soon, of course! We've cleared off a few of the highest things on our queue but the list of things we'd like to do is still pretty long.

 2017 November 29 (Adam) It's time for another World Cup! It astounds me how much the site has grown — last year when I was making this table I didn't even have three years of data to work with. Now I have fourteen. I did my best to find the participants but some of the national teams appear to have people that have never played in a GP or PT. They've been colored gray and given the starting rating of 1500. Some ratings may differ slightly from players' personal pages because they incorporate corrections and/or data from 2003-05 which hasn't been integrated into the site yet. (More on that next week!) You can sort this table by clicking on a column heading. Note that with only three people on each team, the middle rating is the median. If you'd prefer to limit the sort to only the teams with three rated players, here's the average sort with those teams filtered to the top. Mouseover a rating to see the name associated with it, or click on a country to make the names appear. The blue rating is the team captain and the red rating is the national champion. Best of luck to all participating teams!

 2017 October 04 (Adam) The World Championship is this weekend! In preparation I've put together a couple of pages of stats. I tweeted them out earlier this week, but so that they're all in one place, here's some links:

• Head-to-head grid for all twenty-four competitors
• Each player's resume (stats on their season)
• Elo-based metrics on each player's season
• tournament simulations and win expectancies
• The history of small Worlds (Google doc) (record by year for each player that's been invited)

 2017 September 10 (Adam) Another couple of years have been added to the site. Thanks to work I did for Bob Huang's series of articles on CFB I had already put together data for GP Philadelphia 2005, so I did a little bit of 2005 to reach that GP specifically (November 12, 2005). I think we'll run out of useable information in about two more batches. My goal is to do the next one by the end of October, but that might be pushing it.

I tweeted this chart out when I updated with GP Denver but I realized it belonged here as well. Brad Nelson has had a crazy last four GPs: an undefeated win in Omaha, then a 6-0 drafting performance for 13-2 in Kyoto (he finished 11th on breakers), then he reached the top 8 of Minneapolis (lost in quarters), and now he's won again in Denver. Making top 8 in three GPs out of four attended is pretty rare — only eleven people have done it (some multiple times) and it hasn't been done in two years.

Made top 8 in three Grands Prix out of four attended. (bold: top 8, blue: win)

name | gp #1 | gp #2 | gp #3 | gp #4
Jonathan Sonne | gpphi05 | gpcha05 | gpric06 | gptor06
Quentin Martin | gphass06 | gpcar06 | gpkl06 | gpath06
Kenji Tsumura | gpkl06 | gptoul06 | gpstl06 | gphiro06
Shuhei Nakamura | gptoul06 | gpstl06 | gphiro06 | gppho06
Jelger Wiegersma | gpbar06 | gptor06 | gptoul06 | gpmal06
Andre Coimbra | gpmal06 | gphiro06 | gppho06 | gpath06
Klaus Joens | gptorin06 | gpsto07 | gpstra07 | gpfir07
Paul Cheon | gpdal07 | gpcol07 | gpmon07 | gpsf07
Paul Cheon | gpsf07 | gpkra07 | gpday07 | gpvan08
Yuuya Watanabe (4/4) | gpban09 | gpnii09 | gppra09 | gpmel09
Yuuya Watanabe | gppra09 | gpmel09 | gptb09 | gpkit09
Shota Yasooka | gpkob11 | gpsin11 | gpsha11 | gpbri11
Owen Turtenwald | gpatl11 | gpar11 | gpden11 | gpdal11
Owen Turtenwald (4/4) | gpden11 | gpdal11 | gppro11 | gpsin11
Yuuya Watanabe | gpkan11 | gpsha11 | gppit11 | gpmon11
Paul Rietzl | gpsea12 | gpmc12 | gpslc12 | gpana12
Yuuya Watanabe | gpkob12 | gpkl12 | gpmani12 | gpyok12
Sam Black | gplou13 | gpwdc13 | gpabq13 | gptor13
Jeremy Dezani | gpvie13 | gppra14 | gppar14 | gpvie14
William Jensen | gpphi14 | gpatl14 | gpchi14 | gpdc14
Pascal Maynard | gpott14 | gpoma15 | gpmex15 | gpvan15
Martin Juza | gpman15 | gpshi15 | gpmex15 | gpsev15
Paul Rietzl | gpsd15 | gpokc15 | gpwis15 | gpind15
Andrew Cuneo | gplv17-lim | gptor17 | gpind17 | gpdc17
Corey Baumeister | gpmin17 | gpden17 | gpdc17 | gpphx17
Corey Baumeister | gpden17 | gpdc17 | gpphx17 | gpatl17

Note that Owen and Yuuya are the only two to have a streak of four individual GP top 8s in a row, and both of them were part of stretches of five out of six!

(This table was updated on November 17, 2017.)

 2017 August 17 (Adam) Piggybacking on the script I wrote for the previous chart, there's now one for expected number of pro points based on your record. Interestingly, the "0-0" box reads 4.4; of course it slowly dwindles down to 3.0 as the number of losses ticks up toward eight. This says that a PT appearance is on average worth 4.4 pro points, and thus gold status is worth about 17.6 just from the four PT berths.

 2017 August 01 (Adam) I decided that instead of updating the chart in the post below with the results from PT Hour of Devastation, I should make a separate page for it which I'll update after every Pro Tour.

Yesterday I worked on recreating my favorite FiveThirtyEight infographic using the data we've collected from the site. If you're following me on Twitter, you probably saw some escapades as I tried repeatedly to get this right, and even the final image that I posted wound up slightly off. >.< The biggest culprit was the play-in rounds of Pro Tour Kaladesh and Pro Tour Aether Revolt, which deeply confused my script that attempted to figure out who was in the top 8 of a given event. Second-biggest was PT Kyoto 2009, which apparently only had 14 rounds of Swiss, and my program then gave everyone a bye in rounds 15 and 16. ^_^; Oops.

Here is, as far as I can tell, the correct table. This graph tabulates the percentage of players with a given record that have gone on to make the top 8. Data comes from all 16 round split-format PTs (ones with both draft and constructed). That's the last 28 tournaments. Some multi-draw columns were omitted due to small sample sizes. You can now mouseover the cells (or tap them on mobile) to see the data; the tooltip shows {the number of people who made top 8 after having this record} / {the number of people who have had that record in total}.
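Each cell: percentage making top 8 (number who made top 8 / number who have had that record).

at conclusion of round... | x-0 | x-1 | x-2 | x-2-1 | x-3 | x-3-1 | x-4
16 | LSV (1/1) | 100 (1/1) | 100 (1/1) | 100 (17/17) | 100 (11/11) | 99 (116/117) | 23 (35/150)
15 | 100 (2/2) | 100 (2/2) | 100 (12/12) | 100 (45/45) | 94 (112/118) | 47 (16/34) | 7.8 (30/384)
14 | 100 (2/2) | 100 (7/7) | 98 (58/59) | 76 (10/13) | 55 (115/207) | 33 (14/42) | 2.5 (15/591)
13 | 100 (2/2) | 100 (18/18) | 79 (77/97) | 38 (7/18) | 29 (101/339) | 19 (12/62) | 0.60 (5/835)
12 | 100 (4/4) | 96 (32/33) | 60 (95/157) | 30 (9/30) | 13 (71/537) | 7.7 (6/78) | 0.26 (3/1160)
11 | 100 (7/7) | 82 (48/58) | 37 (99/264) | 8.3 (3/36) | 6.6 (55/828) | 4.7 (5/107) | <0.1 (1/1508)
10 | 100 (14/14) | 63 (61/96) | 22 (99/445) | 6.9 (4/58) | 3.0 (36/1195) | 2.7 (4/147) | 0 (0/1877)
9 | 82 (19/23) | 44 (80/179) | 12 (91/714) | 5.3 (5/95) | 1.2 (20/1698) | 1.3 (2/153) | 0 (0/2064)
8 | 64 (29/45) | 29 (93/311) | 6.6 (75/1130) | 2.4 (3/123) | 0.66 (15/2272) | 0.56 (1/178) | 0 (0/2606)
7 | 51 (42/81) | 18 (105/566) | 3.5 (60/1698) | 1.3 (2/152) | 0.31 (9/2863) | 0.57 (1/175) | 0 (0/2724)
6 | 33 (56/166) | 11 (109/979) | 1.9 (47/2455) | 0.59 (1/170) | 0.15 (5/3280) | 0.60 (1/166) | 0 (0/2391)
5 | 22 (74/333) | 6.1 (101/1649) | 1.1 (37/3307) | 1.1 (2/178) | 0.15 (5/3296) | 0 (0/114) | 0 (0/1595)
4 | 13 (92/674) | 3.5 (94/2663) | 0.77 (31/4017) | 0.65 (1/153) | <0.1 (1/2639) | 0 (0/44) | 0 (0/634)
3 | 8.8 (119/1357) | 2.0 (80/4050) | 0.49 (20/4041) | 0 (0/76) | 0 (0/1303)
2 | 5.3 (144/2720) | 1.3 (70/5434) | 0.26 (7/2672)
1 | 3.4 (185/5488) | 0.68 (37/5442)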
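For anyone curious how a table like this can be tabulated, here is a minimal Python sketch. It is not the script behind the chart: the input format (one result string per player plus a top 8 flag) and the function name are assumptions made for the example, and it ignores complications such as byes and play-in rounds.

```python
from collections import Counter


def top8_rates(players):
    """Count, for every (round, wins, losses, draws) record, how many
    players held it and how many of those went on to make the top 8.

    players: iterable of (results, made_top8), where results is a string
    of "W", "L", or "D" with one character per Swiss round.
    """
    seen, made = Counter(), Counter()
    for results, made_top8 in players:
        wins = losses = draws = 0
        for round_number, result in enumerate(results, start=1):
            if result == "W":
                wins += 1
            elif result == "L":
                losses += 1
            else:
                draws += 1
            key = (round_number, wins, losses, draws)
            seen[key] += 1
            if made_top8:
                made[key] += 1
    # Each value is the (made top 8, total) pair shown in the tooltips.
    return {key: (made[key], seen[key]) for key in seen}


# Toy data: four hypothetical players in a three-round event.
players = [("WWL", True), ("WWW", True), ("WLL", False), ("WWL", False)]
rates = top8_rates(players)
```

The percentage in each cell is then just the first number divided by the second.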

If you'd prefer an image of this table for whatever reason, here's a .png file. This will always lead to a current version of the chart.

Here's a couple of notes about the unlikely numbers in the table.

• The only person to miss the Top 8 with a 12-3-1 record (37 match points) was Kenny Oberg in Amsterdam 2010. He finished a distant ninth on breakers; Kai Budde was eighth.
• There was also one person to miss the Top 8 from 12-2: Francesco Cipollesci at Pro Tour Nagoya 2011. Sorry, Francesco...
• In seven of the 28 tournaments, nobody on 36 match points made the top 8. PT Kaladesh was the last time the door was closed on them. Twice three 12-4s made it (PTRTR, PTEMN).
• The 16-0 box belongs to LSV, but do you remember who the other person was to have a chance at matching him? That would be Stanislav Cifka at PT Return to Ravnica. Kelvin Chew beat him in the final round to relegate him to the 15-1 box. I was surprised that there was also only one inhabitant of the 14-2 box; I guess if you don't have a good reason to dream-crush someone then there's no need to play the last round out with a 14-1 or 13-2 record. The lone person to achieve that record was Chris Fennell at PT Amonkhet, who (I believe) played the last round for team series reasons as he was paired against Musashi's Ken Yukuhiro.
• I find it pretty amazing that nobody has flamed out from 10-0 yet. I guess it's difficult to make it to 10-0, as there's often only one or two undefeated players after day one to begin with. Plus your tiebreakers at 10-0 will be good enough to make it in at 12-4 when someone on 36 points is admitted. But you can go 1-5 or 0-6 from that position and miss, can't you? My guess is that the 100 in that box should actually be in the 90s somewhere, and over time it will decrease a bit.
• One person has come back from 1-3 to make the top 8: Alexander Hayne at PT Avacyn Restored. It was... a miracle. (Sorry, had to.)
• The only recovery from 1-2-1 was Eduardo Sajgalik at PT Return to Ravnica.
• While I'm sure many people have run off five straight constructed wins from 7-4, the only one of those to wind up 12-4 and make the top 8 was Noah Swartz at PT San Jose 2010. It helped that Noah started 7-0, so his tiebreakers were as good as they could have been.
• Noah is also one of the answers to the following trivia question: which players have made top 8 of a PT despite going 0-3 in a draft in that PT? There are four total; the other three people to do this are Brian Kibler (PT Austin 2009), Naoki Nakada (PT Paris 2011), and Jiachen Tao (PT Oath of the Gatewatch). They all 0-3'd the draft to start day two. Note that the 0-3 box in the table is a flat zero, but that doesn't tell the whole story about drafts on day one because the draft used to be rounds 6-8. Still, it is true that no one has ever 0-3'd their day one draft and made top 8.
• Last one for now: the seven people to come back from 0-2 are, in chronological order, You're not out until you're out! (Or apparently until you're 0-3.)

 2017 July 21 (Adam) 2008 and 2009 were integrated into the site yesterday. This update added 48 tournaments and around 140,000 matches. The site is big: 363 tournaments and almost 1.7 million matches in total now. The curating process is getting faster, though I expect that the scraping process will compensate by getting more difficult. So far I've been able to recover every round except for GP Costa Rica 2012, round 2. I bet in the next two years we'll come across a GP whose data is bad enough that we'll have to let a few rounds go. Just in time for PT Kyoto 2017 I've added the previous time the Tour has stopped in Kansai. So it's time for a pop quiz: Who won PT Kyoto 2009? I knew going into the project that there are people in different parts of the world that have the same name, but I didn't appreciate the problem of two people being in the same part of the world, ten years apart, with the same name. That is, until I had to try to reconcile results from 2008 with results from 2017. I'm doing my best, but stuff slips through the cracks. Your help in correcting the data is always dearly appreciated.

Since we just had two additions to this table in the last couple of weeks, it feels like a good time to post it here. This is a list of everyone, since 2008, who has won a GP without losing a match. No one has ever actually won every match they were paired for, though this is largely a matter of deciding how you count intentional draws.

name | W-L-D-ID | event
Tsumura, Kenji | 12-0-1-1 | gpkl06
Saito, Tomoharu | 11-0-1-2 | gpsin09
Vidugiris, Gaudenis | 12-0-1-2 | gpden11
Shiels, David | 11-0-3-1 | gpdal11
Parker, Richard | 16-0-0-1 | gplil12
Duke, Reid | 13-0-0-2 | gpnas12
Darras, Alexandre | 12-0-1-2 | gpman12
MacMurdo, Walker | 12-0-3-1 | gpauc12
Lanthier, Dan | 12-0-2-2 | gpvan15
Lipp, Scott | 13-0-1-2 | gpsyd16
Saporito, Thiago | 13-0-1-1 | gplv17-lim
Locke, Steve | 16-0-0-2 | gpmin17

We have raw 2008-09 data now; it's going to take a couple of weeks to curate it, but it's good enough to decide whether the winner lost at some point during the tournament or not. The table only lists the record in rounds played, i.e., byes are ignored. This is why the win totals might look low at first blush. Richard Parker only had one bye in GP Lille, so he got an extra couple of matches in at the beginning. This is an insane accomplishment, but it is actually not the record for wins in a single GP: that distinction goes to Brock Parker, who won GP Pittsburgh 2013 with zero byes, going 17-2-0-0 in the event. He was helped by the fact that there was a tenth round of sealed deck played on day one. Given that all current Grand Prix (and almost all older ones) involve fifteen or fewer rounds of Swiss, it seems very likely that no one else has managed 17 wins in one GP.

I believe that GP Montreal last weekend marks the fifteenth occurrence since 1/1/2010 that tables 1, 2, 3, and 4 at an event have chosen to ID. In three of those, it was a clean cut to the top eight: the eight people who drew were guaranteed to have more match points than the rest of the field. In the other twelve instances it came down to tiebreakers and/or the results of pairdowns. Here's how the eight IDers finished, by event.

event | positions
gpoma17 | clean cut
gpmon17 | 1 2 3 4 5 6 7 9
gpvan15 | 1 2 3 4 5 6 7 8
gpman15 | 2 3 4 6 7 8 9 10
gpba14 | 1 2 3 4 5 6 7 8
gpabq13 | 2 3 4 5 6 7 8 9
gpkc13 | 1 2 3 4 5 6 7 9
ptdgm | clean cut
gpsin13 | 1 2 3 4 5 7 8 9
gptai12 | clean cut
gpmc12 | 1 2 3 4 5 6 7 9
gphir11 | clean cut
gpdal11 | 2 3 4 5 6 7 8 9
ptams10 | 1 2 3 4 5 6 7 8
gpbru10 | 2 3 4 5 6 7 8 9
gpkl10 | 1 2 3 5 6 7 8 10
pthon09 | clean cut
gpbrus08 | 1 4 5 6 7 8 9 10
gpnj04 | 2 3 4 5 6 7 8 9
gpfuk02 | 1 2 3 4 5 6 7 8

Only in three(!) of the twelve tournaments where there was suspense did the eight IDers make the top eight. Be careful when drawing, people!

 2017 April 28 (Adam) We've upgraded some of the pages at the left. The leaders page has been expanded and is more sortable than before. You can arrange the table by rating as usual, or now by record or winning percentage by tournament type. It also expands to the top 150; this change needed to happen at some point because the people near the top are all very close to each other, so it was pretty capricious who happened to appear when we only had a top thirty. It felt like an overload to see the whole top 150 by default on mobile though, so we kept the shorter option too. At the moment, 65 PT wins will get you 150th place. I wonder how far up that number will go once we're done adding tournaments to the beginning of the data set. To make room for this, we've moved the histogram/percentiles table to a new stats hub. There are a couple of other widgets there, linking to pages that document some of the ancillary things we've blogged or tweeted about. You can find the table of unintentional draw streaks there, for instance. The plan is to add more of these as we go along. I want to add a Weeks at #1 page that lists who's been the highest-rated player and for how many weeks, for instance. If you have other suggestions for stats we can track, we'd love to hear them!

 2016 December 20 (Adam) I said I wasn't going to scrape for a little while, but I do have some data to share today. After a comment Reid made on Twitter, I got curious to figure out how long someone has gone without an unintentional draw. I've learned that four years is too short a scale for this, as there are several people who don't have even one in our database yet. But, for now, here are the candidates. I'll come back to this question each time I tack a year onto the back of the data set. This comes with the usual caveat that sometimes it's not super clear when a draw in a late round is or isn't intentional, but I've done the best I could. The streaks here do not include unintentional draws. By default, all streaks of 275 matches or more, active or not, are displayed. If you'd prefer, you can filter the table to see only active streaks. After you've done that you can restore the default view. Update (5/1/17): This table now has its own page accessible via the stats hub and is updated with each tournament.

 2016 December 15 (Adam) A couple of days ago I added Grand Prix Milwaukee 2016 to the site, meaning that all of 2014, 2015, and 2016 is here now. There's more than 820,000 matches in the database spanning 149 tournaments. Grading finals and completing a bit of math research is above scraping more tournaments for me at the moment, but I'll get in one more batch of corrections in a couple of days. My goal is to get to PT Theros (10 more tournaments to go) by the end of the year.

 2016 November 12 (Adam) The World Magic Cup begins later this week! I've spent the last couple of days looking at the list of participants, trying to match them to entries in our database. Here's the fruits of that tree. Some teams have players that haven't played in a Grand Prix or Pro Tour in the last three years, so they've been assigned the starting value of 1500 (colored gray in the table). You can sort this table by average, median, top rated player, lowest rated player, or alphabetically. If you'd prefer to limit the sort to only the teams with four rated players, here's the average and median sort with those teams filtered to the top. Mouseover a rating to see the name associated with it, or click on a country to make the names appear. The bold rating is the team captain. Best of luck to all participating teams!

 2016 November 05 (Adam) I've been keeping the database updated with new GPs as they've occurred, and with each update I've managed to get one or two more old ones into the system. Today's update includes the just-finished GP Dallas (congratulations to Kevin Mackie and Skred!) as well as the next Pro Tour back in time, Born of the Gods. (That means it's time for another pop quiz: who won PTBNG?) This last month has seen the rise and fall of Shota Yasooka: he hit a peak of 2263 by winning the PT, then spent a hundred points going 2-4 in Malaysia. At the moment no one is above the "LSV line". We've gotten a number of good feature requests from the community in the last couple of weeks. I don't think we'll have time to add much to the site itself until winter break, but I look forward to implementing some of them. Until then I'll continue trying to bolster the database. We have six more GPs left to reach the beginning of 2014 and seventeen to go to reach the previous Pro Tour. Sounds like a lot, but we're well over a hundred tournaments now, so seventeen more doesn't sound that daunting any more.

 2016 October 12 (Adam) New additions today: GP Beijing 2014, GP Atlanta 2016, and GP London 2016. Lukas Blohon went 2-3-1 drafting in London and it cost him eighty points! The perils of a 2300+ ranking... Best of luck to everyone participating in the Pro Tour!

 2016 September 25 (Adam) I added five more GPs to the beginning of the timeline today, besides tidying up some high-probability duplicate entries. Today's innovation was to check every instance of a last name being shared by exactly two entries, to search for nicknames and typos in the first name. I'll do the reverse when I update next and look for mistakes in last names. Eight more GPs to go before I reach the next Pro Tour...
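The "last name shared by exactly two entries" check described in this entry can be sketched as follows. This is only an illustration, not the curation script itself: the function name, the similarity threshold, and the use of `difflib` are assumptions, and the small nickname table simply reuses name pairs mentioned elsewhere on this blog. It expects names in the "Last, First" style the site uses.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Illustrative nickname pairs; string similarity alone tends to miss these.
NICKNAMES = {
    frozenset({"mike", "michael"}),
    frozenset({"andy", "andrew"}),
    frozenset({"dave", "david"}),
    frozenset({"tony", "anthony"}),
}


def candidate_duplicates(names, min_ratio=0.8):
    """Flag last names shared by exactly two entries whose first names
    look like a typo or a nickname of each other."""
    by_last = defaultdict(list)
    for name in names:
        last, _, first = name.partition(", ")
        by_last[last].append(first)
    candidates = []
    for last, firsts in by_last.items():
        if len(firsts) != 2:
            continue
        a, b = (f.lower() for f in firsts)
        similar = SequenceMatcher(None, a, b).ratio() >= min_ratio
        if similar or frozenset({a, b}) in NICKNAMES:
            candidates.append((last, firsts[0], firsts[1]))
    return candidates


# Hypothetical entries, not real players.
names = ["Smith, Jon", "Smith, John", "Garcia, Mike", "Garcia, Michael",
         "Lee, Ann", "Lee, Robert", "Ito, Ken"]
found = candidate_duplicates(names)
```

Anything this flags is only a candidate; deciding whether two entries really are the same person still takes a human look at when and where each one played.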

 2016 September 13 (Adam) Refreshed the database again. Highlights from this update:

1. Five new tournaments were added: Grands Prix Chicago, Moscow, Manchester, and Atlanta 2014, and Pro Tour Journey into Nyx. (Pop quiz: who won PTJOU?) Some things I was forced to ponder: why was Moscow only 14 rounds? Why are the two halves of round five of GP Chicago the same (including the standings, to ensure maximum difficulty in reconstructing the results)? Why have a GP the week after the Pro Tour in the same city?

2. I discovered that round 12 and round 13 of GP Porto Alegre 2015 were copies of each other. I reconstructed round 13 and we got this fixed. If you happen to notice a player playing the same opponent in two rounds in a row, it might be another instance of this mistake. I'm pretty sure that there aren't any more instances of it in the database at the moment, but this problem may come up again in the future.

3. I merged and split a few people who were/weren't the same thanks to tips we got from the community. Thanks guys! Keep it coming. I also fixed some Mike/Michael, Andy/Andrew, Dave/David, and Tony/Anthony mistakes. I didn't really know where to look to see if people from Moscow were the same as other people; I didn't realize when we started the project that we'd need to be knowledgeable in Russian transliteration conventions. I'm sure there are some entries that need to be combined/separated that so far have gone undetected. Before I add more tournaments I'm going to work on cleaning up what's here a bit.

4. We're now rating unintentional draws. For information on what this entails, check out the entry in the FAQ. I was originally hesitant to do this because I was afraid that I wouldn't be able to tell the difference between intentional draws and unintentional ones. But sometimes it's not hard to tell if it's intentional: [tweet from Owen Turtenwald (@OwenTweetenwald), October 17, 2015] Most intentional draws are reported as "Draw D-D", "Draw 0-0-3", or "Draw 0-0-0", though sometimes 0-0-1 or just 0-0. My general rule of thumb was to interpret any of these notations as representing an intentional draw if I could find any remotely logical reason why the players were incentivized to draw. This includes making top 8, nabbing an extra Pro Point at a PT, or even the rare round nine matchup of 6-0-2s who could ID to make day two. Any sort of 0-0-x draw that ended with one or both players in the money I treated as an ID. Now there were some random apparent IDs in early rounds scattered throughout the tournaments. Sometimes it's easy to confirm that these were unintentional: "This is great :) I don't think Jasper's opponent read @karsten_frank article #GPLille pic.twitter.com/bHeICzw5Xz" (tweet from LukasBlohon (@LukasBlohon), August 28, 2016) But I'm afraid some of these were people convincing their opponent to skip a round and get lunch. There are 17119 draws in the database. Probably about 1000 of them should be intentional, and I've marked about 800 of them. My guess is that about 200 intentional draws are inaccurate. Draws don't have a big effect on the rating, either, so this isn't something to lose sleep over. But doing it the other way, with no draws rated, about 16000 matches were being tallied incorrectly. If the goal is to minimize wrong results, this does represent progress. Note that for rounds that I had to reconstruct, like Porto Alegre round 13 from point #2 above, all results are styled as 0-0, whether it's a win, loss, or draw. So if you want to point out a match to me that's a potential ID, you need more evidence than what the site is displaying.
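The copied-round problem mentioned in this entry (Porto Alegre rounds 12 and 13) is the kind of thing a quick automated check can catch. Here is a minimal Python sketch of one way to do it; the function name, the data layout, and the 90% threshold are all assumptions for illustration, not what the site actually runs.

```python
def duplicated_rounds(rounds, threshold=0.9):
    """Return the (1-indexed) round numbers whose pairings mostly repeat
    the previous round, which suggests a copied results page.

    rounds: list of rounds, each a list of (player_a, player_b) pairs.
    """
    flagged = []
    for n in range(1, len(rounds)):
        # frozenset makes the comparison ignore which player is listed first.
        prev = {frozenset(pair) for pair in rounds[n - 1]}
        curr = {frozenset(pair) for pair in rounds[n]}
        if curr and len(prev & curr) / len(curr) >= threshold:
            flagged.append(n + 1)
    return flagged


# Toy example: round 3 repeats round 2 with the names swapped.
rounds = [
    [("A", "B"), ("C", "D")],
    [("A", "C"), ("B", "D")],
    [("C", "A"), ("D", "B")],
]
flagged = duplicated_rounds(rounds)
```

A threshold below 100% leaves room for a page that was copied and then partially corrected, at the cost of an occasional false alarm in very small events.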

 2016 September 05 (Adam) The numbers in the previous analysis may be slightly off now, because I've just added four GPs to the beginning of the timeline: Grands Prix Boston-Worcester, Taipei, Milan, and Washington DC 2014. I also corrected about a hundred errors, some of which came from the community (thanks everyone! Keep it coming!) and some of which I stumbled across on my own as I was adding new people in. My goal is to get back to Pro Tour Journey Into Nyx for the update next week, so four more GPs to go.

Here's an update to the table from the last entry, with the tournament complete. The deltas are all based on the positions entering the tournament. Congratulations to the world champion, Brian Braun-Duin!

rank Δ | rank | rating | name | record
- | 1 | 2332 | Lukas Blohon | 9-5
▲ 5 | 2 | 2288 | Brian Braun-Duin | 12-3-1
- | 3 | 2211 | Luis Scott-Vargas | 9-5
▲ 7 | 4 | 2170 | Oliver Tiu | 9-5-1
▲ 2 | 6 | 2139 | Seth Manfield | 8-6
▲ 22 | 7 | 2130 | Marcio Carvalho | 10-5-1
▼ 4 | 8 | 2128 | Mike Sigrist | 7-7
▼ 7 | 9 | 2123 | Owen Turtenwald | 6-8
▼ 8 | 14 | 2092 | Yuuya Watanabe | 6-8
▲ 28 | 16 | 2077 | Shota Yasooka | 9-6
▼ 1 | 20 | 2070 | Reid Duke | 7-7
▲ 1 | 22 | 2048 | Brad Nelson | 7-7
▲ 1 | 24 | 2045 | Joel Larsson | 7-7
▼ 14 | 28 | 2031 | Ondrej Strasky | 6-8
▲ 12 | 29 | 2022 | Steve Rubin | 7-6-1
▲ 4 | 42 | 1997 | Paulo Vitor Damo da Rosa | 7-7
▼ 25 | 45 | 1988 | Sam Pardee | 5-9
▲ 69 | 53 | 1979 | Jiachen Tao | 8-6
▲ 142 | 83 | 1945 | Thiago Saporito | 8-6
▼ 153 | 174 | 1878 | Martin Muller | 3-11
▼ 103 | 372 | 1810 | Kazuyuki Takimura | 5-9
▼ 173 | 380 | 1807 | Andrea Mengucci | 5-9
▼ 205 | 424 | 1795 | Ryoichi Tamada | 4-10
▼ 12 | 1067 | 1716 | Niels Noorlander | 5-9

Not shockingly the rating and ranking of the people who went 7-7 is very similar to their starting values. For instance, Reid's 7-7 changed his rating by 1.97 points (from 2068.35 to 2070.32) and his ranking from #19 to #20. Less obviously, Elo was not impressed by Blohon going 9-5. Given his schedule (i.e., that he played a player rated 2066, then a player rated 2026, etc.), a correctly-rated 2320 should have won about 8.65 of their matches. So this was a slight overperformance in the system's eyes, hence a slight improvement to his rating: he ascended from 2320 to 2332. Similarly, Niels performed shockingly close to expectation: his rating moved by only 0.75! He went from 1716.30 to 1715.65, a change small enough to be swallowed up by rounding. Again, the bulge of people in the low 1700s still meant that he was passed by twelve people.

Since the field is small I was able to add the results from today's matches into the system. The 6-1 records from Brian Braun-Duin and Marcio Carvalho were worth close to a hundred Elo points each! These big swings are possible because each player in the tournament has a comparably stratospheric rating, so each match is worth a lot to each participant. (In contrast, a typical Grand Prix for a player with a 2100 rating is kind of like a college football schedule: a smattering of titanic clashes interspersed with the Elo equivalent of FBS teams.) Here's a look at how each player's ranking has changed. I'll update again on Saturday night after the back half of the Swiss rounds is in the books.

rank Δ   rank  rating  name                       record
-           1    2336  Lukas Blohon               5-2
▲ 5         2    2226  Brian Braun-Duin           6-1
▲ 1         3    2197  Mike Sigrist               5-2
▼ 1         4    2175  Luis Scott-Vargas          4-3
▲ 3         5    2166  Seth Manfield              5-2
▼ 5         7    2156  Owen Turtenwald            3-4
▼ 2         8    2122  Yuuya Watanabe             3-4
▲ 20        9    2121  Marcio Carvalho            6-1
▼ 1        12    2108  Oliver Tiu                 3-3-1
▲ 4        15    2097  Reid Duke                  4-3
▼ 6        20    2057  Ondrej Strasky             3-4
▼ 2        22    2047  Sam Pardee                 3-4
▼ 4        27    2029  Brad Nelson                3-4
▲ 16       28    2025  Shota Yasooka              4-3
▲ 17       29    2016  Paulo Vitor Damo da Rosa   4-3
▼ 1        42    1996  Steve Rubin                3-3-1
▼ 18       43    1993  Joel Larsson               2-5
▼ 58       79    1947  Martin Muller              1-6
▲ 33       89    1937  Jiachen Tao                4-3
▲ 93      132    1902  Thiago Saporito            4-3
▼ 6       213    1858  Andrea Mengucci            3-4
▼ 82      301    1828  Ryoichi Tamada             2-5
▼ 96      365    1813  Kazuyuki Takimura          2-5
▼ 962    2017    1670  Niels Noorlander           1-6

5-2 was about par for the course for Lukas, who maintains his incredible peak. It is certainly unsustainable, but I'm curious to see how long he can continue to hold such a high rating. As Rebecca said in the post below, we shouldn't look at Lukas's high rating as an indication that he's that much more likely to win the tournament from this position. What Elo is picking up on is that his recent results (112-38 in his last 150 matches!) are consistent with the results of a real juggernaut.

And to be fair to Niels, Elo didn't punish him too much for his 1-6 day: it only cost him about 46 points of rating. Given the Elo ratings of the people he played, the system only expects a 1716 player to manage about 2.4 match wins. For comparison, a 1500 player would only expect about 1.76, and going 1-6 against that slate would only cost the 1500 player about 27 points. (These 25-to-50 point adjustments are very small. Remember our ratings are "elongated," so that a 25-point difference only corresponds to around 1.25% of win expectancy in any given match.) The fact that Niels has a large ranking delta just has to do with the fact that there are way more high-1600s players than there are players in the 1800s and above, so he fell past a big pack of people.
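
To make the "elongated" remark concrete, here is a minimal sketch of a match-by-match Elo update and of how little a 25-point gap is worth. The scale of 1135 and the K-factor of 36 are assumptions rather than values stated in this post (classic Elo uses a scale of 400), and the opponent ratings are placeholders, not Niels's actual slate. The sketch also tracks only one player's rating and ignores draws.

```python
def win_expectancy(gap, scale=1135.0):
    """Win probability for a player rated `gap` points above the opponent.

    `scale` is an assumed constant; a larger scale flattens the curve.
    """
    return 1.0 / (1.0 + 10.0 ** (-gap / scale))


def play_day(rating, opponents, results, k=36.0, scale=1135.0):
    """Apply Elo updates one match at a time.

    `results` holds 1 for a win and 0 for a loss; `k` is an assumed K-factor.
    """
    for opp, result in zip(opponents, results):
        rating += k * (result - win_expectancy(rating - opp, scale))
    return rating


# Edge that a 25-point rating gap buys in a single match.
print(f"{win_expectancy(25) - 0.5:.2%}")

# Hypothetical 1-6 day against seven placeholder opponents rated 2050.
after = play_day(1716, [2050] * 7, [1, 0, 0, 0, 0, 0, 0])
print(round(after - 1716, 1))
```

Because each loss to a much higher-rated opponent was already mostly expected, it costs well under half of K, which is why a 1-6 day against this field is cheaper than it sounds.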

 2016 August 31 (Rebecca) A quick note/musing on the Worlds simulation Adam posted about below: With no disrespect to Lukas Blohon, it's obvious that in no realistic model is he 16% to win the tournament. So why did the simulation come out that way, and does it mean that the expected win percentages that our model is assuming are pretty far off? Well, the short answer is no. When we look at all matches in our dataset between "veteran" players (three events or ten matches played), players whose rating is 195-205 points higher than their opponent have a 58.89% win percentage, which is very close to what the model expects. The problem is that at any given moment in time, the rating of any player who has just won a tournament or had a couple of deep runs is inflated a bit above their equilibrium point. When we take a snapshot of the current ratings, and then run a simulation forward only sixteen matches, the effect of that inflation is exaggerated. There would probably be a more "polls-plus" way to try to simulate the outcome of a given tournament, adjusting for recent big swings in rating, but it would take (a) a larger dataset (ours is still relatively small for the moment), and (b) more time. For now, if you're using our site to help make your MTG Worlds fantasy draft picks (as I have been!), take the actual match/tournament history data seriously, and the win probabilities as entertainment. ^_^
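
The calibration check described above can be sketched roughly as follows: pick every match whose rating gap falls in a band, count how often the higher-rated player won, and compare that with the model's prediction for the middle of the band. The function names, the tuple layout of `matches`, and the scale of 1135 are all assumptions made for illustration; the "veteran" filter and the handling of draws are left out, and the three sample matches are toy data rather than anything from our dataset.

```python
def model_win_rate(gap, scale=1135.0):
    """Model's win probability for the favorite at a given rating gap.

    `scale` is an assumed constant, not one confirmed by this post.
    """
    return 1.0 / (1.0 + 10.0 ** (-gap / scale))


def observed_win_rate(matches, lo=195, hi=205):
    """Share of matches won by the higher-rated player within a gap band.

    `matches` is an iterable of (rating_a, rating_b, a_won) tuples.
    Returns None when no match falls inside the band.
    """
    wins = total = 0
    for rating_a, rating_b, a_won in matches:
        gap = rating_a - rating_b
        if lo <= abs(gap) <= hi:
            total += 1
            favorite_won = a_won if gap > 0 else not a_won
            wins += 1 if favorite_won else 0
    return wins / total if total else None


# Toy data only: two matches inside the 195-205 band, one outside it.
toy_matches = [(2200, 2000, True), (2000, 2200, True), (2100, 2000, True)]
print(observed_win_rate(toy_matches))
print(round(model_win_rate(200), 4))
```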

The World Championship starts later this week. Here are links to the 24 people who will be competing for the trophy. The field has seven of the top eight (sorry Scott Lipp!) and half of the top thirty.

rank  rating  name                       win%     top 4%   swiss
   1    2320  Lukas Blohon               16.55%   46.38%   8.747
   2    2200  Owen Turtenwald             8.81%   30.63%   8.000
   3    2172  Luis Scott-Vargas           7.50%   27.47%   7.833
   4    2166  Mike Sigrist                7.25%   26.77%   7.796
   6    2150  Yuuya Watanabe              6.54%   24.98%   7.699
   7    2148  Brian Braun-Duin            6.44%   24.72%   7.685
   8    2136  Seth Manfield               5.98%   23.48%   7.610
  11    2113  Oliver Tiu                  5.15%   21.17%   7.475
  14    2093  Ondrej Strasky              4.54%   19.20%   7.351
  19    2068  Reid Duke                   3.82%   17.06%   7.208
  20    2066  Sam Pardee                  3.76%   16.87%   7.193
  21    2050  Martin Muller               3.38%   15.46%   7.095
  23    2049  Brad Nelson                 3.37%   15.45%   7.090
  25    2045  Joel Larsson                3.22%   15.06%   7.063
  29    2028  Marcio Carvalho             2.90%   13.85%   6.964
  41    1999  Steve Rubin                 2.32%   11.72%   6.789
  44    1992  Shota Yasooka               2.22%   11.29%   6.748
  46    1990  Paulo Vitor Damo da Rosa    2.16%   11.16%   6.735
 122    1911  Jiachen Tao                 1.14%    6.93%   6.259
 207    1862  Andrea Mengucci             0.75%    4.97%   5.958
 219    1856  Ryoichi Tamada              0.70%    4.74%   5.922
 225    1853  Thiago Saporito             0.68%    4.67%   5.903
 269    1839  Kazuyuki Takimura           0.61%    4.24%   5.817
1055    1716  Niels Noorlander            0.19%    1.68%   5.061

The three rightmost columns were created by simulating the tournament one million times. They show the share of simulations in which that player wins the tournament or makes the top four, as well as that player's average number of wins in the Swiss portion of the tournament.

I can't claim to be 100% sure that I got the pairings algorithm correct, but I did my best. I assumed that the draft pods would be made up of the people in positions 1-8, 9-16, and 17-24 respectively, that the drafts are seated randomly and that pairings in the drafts are based on seat, and that you can't play someone in constructed that you've already played in constructed (disregarding format). The latter two things might not be totally accurate (I'm guessing that #1 and #2 are encouraged to play in the first round of the second draft?), but I think it's probably close enough for the numbers to be in the ballpark. The outcome of each match is decided by flipping a weighted coin whose weight is determined by the Elo win expectancy scheme. The ratings are updated after each round, so Lukas Blohon doesn't necessarily enjoy a stacked deck throughout the simulation.
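
The weighted-coin idea can be sketched in a few lines. This is a deliberately simplified stand-in, not the simulation that produced the table: it pairs players at random each round instead of using draft pods and the no-rematch rule, it has no top-four playoff (the champion is whoever has the most Swiss wins, with ties broken at random), and the scale of 1135, the K-factor of 36, and all function names are assumptions. Since every player plays once per round, updating after each match is the same as updating after each round.

```python
import random


def win_expectancy(rating_a, rating_b, scale=1135.0):
    """Elo-style probability that A beats B; `scale` is an assumed constant."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def simulate_once(ratings, rng, rounds=14, k=36.0):
    """One simplified tournament with random pairings (NOT the real rules)."""
    current = dict(ratings)            # working copy, updated as we go
    wins = {name: 0 for name in ratings}
    names = list(ratings)
    for _ in range(rounds):
        rng.shuffle(names)
        # With an odd field the last shuffled player simply sits out.
        for a, b in zip(names[0::2], names[1::2]):
            p = win_expectancy(current[a], current[b])
            a_won = rng.random() < p   # the weighted coin flip
            wins[a if a_won else b] += 1
            delta = k * ((1.0 if a_won else 0.0) - p)
            current[a] += delta
            current[b] -= delta
    return wins


def simulate(ratings, n_sims=10_000, seed=0):
    """Return (share of titles, average Swiss wins) per player."""
    rng = random.Random(seed)
    titles = {name: 0 for name in ratings}
    total_wins = {name: 0 for name in ratings}
    for _ in range(n_sims):
        wins = simulate_once(ratings, rng)
        best = max(wins.values())
        champion = rng.choice([n for n in ratings if wins[n] == best])
        titles[champion] += 1
        for name, w in wins.items():
            total_wins[name] += w
    win_share = {n: titles[n] / n_sims for n in ratings}
    avg_wins = {n: total_wins[n] / n_sims for n in ratings}
    return win_share, avg_wins


# Four ratings taken from the table above, run as a toy four-player field.
field = {"Blohon": 2320, "Turtenwald": 2200, "Tao": 1911, "Noorlander": 1716}
win_share, avg_wins = simulate(field, n_sims=2000)
print(win_share)
print(avg_wins)
```

Fixing the seed makes a run repeatable, which helps when comparing the effect of changing one assumption at a time.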

Mainly I was curious as to what Lukas's 120-point lead on the field in Elo translated to on a whole-tournament scale instead of an individual-match scale. It's rather sizeable. These numbers do highlight the limitations of Elo: I don't think that it's possible that Niels Noorlander's odds are really like one in 520 as the table suggests. In FiveThirtyEight parlance, this is much more of a now-cast than polls-plus. But if you want some food for thought while you fill out a fantasy draft, here you go.

Good luck to everyone participating!

 2016 August 26 (Adam) As part of a huge update fixing many typos and other small inconsistencies with the dataset, I went through David Williams's Twitter feed. Big-name players who tweet about their tournaments are invaluable. Keep it up, guys. As part of this update I investigated: (a) every pair of names that were off by one character, (b) every person who played in two GPs that occurred simultaneously, and (c) every entry that had a parenthetical nickname. This unearthed hundreds of pairs that I felt sure enough to combine, and some that needed to be split. As with any combining effort, there are going to be some false positives (entries that should not have been merged that were) and false negatives (names I should have merged but elected not to). Many of the false positives probably came from how I aggressively merged all "Yusuke" and "Ryusuke"-s into a corresponding "Yuusuke" and "Ryuusuke". But I think overall I improved the quality of the database by a sizeable margin. The update removed 486 duplicate entries.