Project “W” Second Jump Results

“Knowledge is the death of research” – Walther Hermann Nernst, Chemist

12.04.yc118 J163408 < E-C00264 < Region E-R00026

Rather than lead you through the data analysis of Phase II, let’s cut to the chase and reveal the results.

The null hypothesis: For each known wormhole type, connections land in the possible destination regions at random, following the expected distribution by region for that type. We test this at a significance level of 0.05.

Conclusion based on Phase II data: Since the p-values are greater than the significance level of 0.05, we fail to reject (informally, accept) the null hypothesis. The observed distribution is consistent with the expected distribution.

TLDR: Known wormhole connections are equally random.

Phase II

Project Coordinator: Katia Sae
Project Liaison: Merkato Cesaille
Technical Lead: David Louis
Project Specialists: Ashlar Maidstone, Soul Darkshade
Research Team: Forcha Alendare, Alek Azam, Lucas Ballard, Triffton Ambraelle, enkidu nagata, Sanibel, Mushroom Greene, Theo Fugger, Caleb Wolfram, Akatsuki Hikage, Pileto, Mako Koskanaiken, Earthling Jaer, Vladimir Gengodov, Gorgan Fullsail, Caille Sinclair, Jen Outamon

Read on if you’re interested in some of the details. I’ll specifically target our exceptions from Phase I.

Presentations

Check out these posts for more information about Project “W”, how it came about, and the Phase I results.

Get on with it!

From September to the end of November yc118 (2016), a total of 15,305 connections were observed. From that data set, I used only the connections with a known wormhole type, which gave me 4,902 connections to analyze. Compared to the 300 connections from Phase I, we increased our sample size by a factor of 16.34. With this data set, we were able to meet all of the conditions of the Chi-square Goodness of Fit test:

  • Sampling method is simple random sampling. Our observed connections are equally likely to occur in our expected destination population (Regions). Passed.
  • Variable under study (connection type) is categorical (Regions). Passed.
  • The expected value of the number of sample connections in each level of the variable is at least 5. Passed.
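
The third condition is easy to check mechanically. Here is a minimal Python sketch, assuming hypothetical region names and expected proportions (they are illustrative only, not Project “W” data); the function names are also made up for this example.

```python
def smallest_expected_count(total_observed, expected_proportions):
    """Return (region, expected count) for the region with the lowest expected count."""
    expected = {region: total_observed * share
                for region, share in expected_proportions.items()}
    region = min(expected, key=expected.get)
    return region, expected[region]


def meets_minimum(total_observed, expected_proportions, minimum=5):
    """True when every region is expected to receive at least `minimum` connections."""
    _, lowest = smallest_expected_count(total_observed, expected_proportions)
    return lowest >= minimum


# Hypothetical proportions, for illustration only.
proportions = {"Region A": 0.50, "Region B": 0.40, "Region C": 0.10}

print(meets_minimum(36, proportions))  # Region C is only expected to get 3.6
print(meets_minimum(60, proportions))  # Region C is now expected to get 6.0
```

With a small sample the rarest region falls under the minimum, which is why the check only passes once enough connections have been collected.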

From Phase I, our anomalies concerned High Sec, specifically the Genesis and Molden Heath regions, and Class 5 wormholes, specifically region E-R00024. Let’s compare them.

[Figure: Phase I High Sec]

[Figure: Phase II High Sec]

As you can see, our High Sec observations by region went from a range of 0 to 9 in Phase I to a range of 5 (our minimum) to 120 in Phase II.

[Figure: Class 5 Phase I]

[Figure: Class 5 Phase II]

Our Class 5 observations by region went from a range of 0 to 7 to a range of 5 to 50.

[Figure: High Sec by Chi-sq Phase I]

[Figure: High Sec by Chi-sq Phase II]

[Figure: Class 5 by Chi-sq Phase I]

[Figure: Class 5 by Chi-sq Phase II]

Comparing the Chi-square rankings, you can see that, with the additional data collected, our anomalies fell back in line with the other regions.

Conclusion

That’s really all there is to it. The research is a good lesson that conclusions can’t be drawn until all of the conditions of a given test are met. In this case, our Phase I data didn’t have the minimum of 5 observations for each region of New Eden. Our Phase II data met that requirement and we were able to show that, at least for known wormhole connections, the destination region is equally random.

Where to from here?

I’m going to look at K162 connections and see if there are any abnormalities to be found there. My thought process is this: A K162 connection should randomly connect to anywhere, be it High Sec, Low Sec, Null Sec, or W-Space. The caveat is this: when reviewing the K162 connection, it will give you an indication of what type of space it leads to, just like our known wormhole connections did in our previous two analyses. I’m going to go with the assumption that until you look at the connection, it could lead anywhere. I’ll call this Phase III and use the same data set we just collected. Stay tuned…

Project “W” First Jump Results

“If we knew what it was we were doing, it would not be called research, would it?” – Albert Einstein, Theoretical Physicist

10.09.yc118 J144135 < D-C00202 < Region D-R00021

Link to the presentation

What is Project “W”

I posted an observation back in April yc118 (2016) that started off this research project, which I titled Project “W”. There’s no rhyme or reason for the name, I just didn’t know what to call it. You can read more about that following this link. After my blog post, others came forward and said they’d noticed similar things and offered suggestions as to what could be going on, ranging from “there is something odd” to “that’s the nature of randomness and the way the brain works looking for patterns”. I figured the only way to prove or disprove anything one way or the other would be to collect some data and do some analysis. So, Project “W” was born.

With the help of some of my Signal Cartel corp mates and friends, we spent about 3 months, from April to June yc118, collecting data while navigating wormhole connections. At first I had thought there may be some kind of lightyear limit between systems that could possibly explain the oddity, but after Johnny Splunk reviewed the Thera data from the EvE-Scout site, he stated there didn’t seem to be a correlation. So, we proceeded with the data collection without a premise, just mainly interested in seeing if any data anomalies would present themselves.

The Project Team

Before we start the analysis of the data collected, I want to shout out to our Research Team. Special thanks to: Aiken Paru, Mirielle Asaki, Kobura Juraxxis, Mushroom Greene, Mynxee, Dr Zemph, Delaine De’Andre, Mark726, Saile Litestrider, Zecht Reddas, Forcha Alendare, Dorian Reu, Pileto, Jen Outamon, Mason Akiwa, Josca Aldent, Ashlar Maidstone, Stikkem Innagibblies, Dungeon Manager, Ozob Bozo, Andrew Chikatilo, Johnny Splunk.

Observed Connections and Doing the Analysis

A total of 663 connections were observed. Of those, 300 connections were via a known wormhole type, which means we know what type of space and possible region was on the other side. This will become our dataset for this first pass of the analysis. Because of this measurable dataset, I chose to use the Chi-Square Goodness of Fit test.

The Chi-Square Goodness of Fit test is appropriate if the following conditions are met:

  • Sampling method is simple random sampling. Our observed connections are equally likely to occur in our expected destination population (Regions). Passed.
  • Our variable under study (connection type) is categorical (Regions). Passed.
  • The expected value of the number of sample connections in each level (by Region) of the variable is at least 5. Failed. More data is necessary to fulfill this requirement. However, we’ll still take a look at what we do have; if nothing else, it’s a place to start.

The Special W-Space Class & Regions

As well as excluding the 363 exit wormhole connections and connections where the type wasn’t recorded, I also excluded Class 12 (Thera), Class 13 (frigate-sized accessible systems), and Classes 14 through 18 (Drifter wormholes) because each one is in its own region; when you find one of those connections, there’s a 100% chance you are landing in that region of space.

Determining the Expected

By knowing the signature type, we know the type of space and possible region where the destination is likely to be. For example, a wormhole connection with a type of E004 will connect to a Class 1 wormhole. We know Class 1 wormholes constitute Regions 1, 2, 3, and A-R00001. We know how many systems are in each region and, assuming our hypothesis that every destination system is equally likely, we can compute the probability of exiting in each region. For example, from our chart, you can see that when finding a connection that leads to a Class 1 wormhole, there’s a 37.2% chance of exiting in Region 1, 42.7% in Region 2, and so on.
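
As a rough sketch of that computation in Python: under the hypothesis, each region’s probability is simply its share of the systems for that class. The system counts below are hypothetical placeholders, not the actual New Eden figures, and the function names are just for illustration.

```python
def expected_distribution(systems_per_region):
    """Probability of exiting in each region if every system is equally likely."""
    total = sum(systems_per_region.values())
    return {region: count / total for region, count in systems_per_region.items()}


def expected_counts(distribution, connections_found):
    """Scale the expected probabilities up to the number of connections found."""
    return {region: share * connections_found
            for region, share in distribution.items()}


# Hypothetical system counts, for illustration only.
systems = {"Region X": 120, "Region Y": 60, "Region Z": 20}

distribution = expected_distribution(systems)
counts = expected_counts(distribution, connections_found=36)

for region in systems:
    print(f"{region}: {distribution[region]:.1%} chance, "
          f"{counts[region]:.2f} expected out of 36")
```

The expected counts are what the found counts get compared against in the Chi-Square test.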

[Figure: Class 1 expected distribution by region]

In the following two slides you can see the K-Space and W-Space expected distributions by region.

Class 1 Chi-Square Goodness of Fit Test

[Figure: Class 1 results]

Let’s get to the analysis. I started with Class 1. Above you saw our expected distribution. In the Class 1 results, you see that we found a total of 36 connections leading to Class 1 wormholes. If we take that total and apply our expected distribution to it, you see that for Region 1 we found 13 and expected 13.37, for Region 2 we found 15 and expected 15.39, and so on. The Chi-Square calculation measures the difference between the found and expected counts for each region and sums those values. From that sum we compute the p-value: the probability of seeing a difference at least this large if our observations really do come from the expected distribution. In this case the p-value is 0.99, so the observations are a very close match to what we expected.

Since the p-value of 0.99 is greater than the significance level of 0.05 (our measuring stick to find the exceptions), we fail to reject (informally, accept) the null hypothesis. The TLDR is that connections leading to Class 1 wormholes are equally random across the destination systems. In other words, the destination appears to be randomly determined.
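
For anyone who wants to follow the arithmetic, here is a minimal sketch of the calculation in plain Python. It is not the tool used for the project. The p-value comes from a standard closed form for the chi-square upper tail, with degrees of freedom taken as the number of regions minus one. Only the Region 1 and Region 2 figures come from this post; the four-region example is hypothetical.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p_value(statistic, degrees_of_freedom):
    """Upper-tail probability of the chi-square distribution (stdlib only)."""
    z = statistic / 2.0
    if degrees_of_freedom % 2 == 0:
        a, tail = 1.0, math.exp(-z)             # exact tail for 2 degrees of freedom
    else:
        a, tail = 0.5, math.erfc(math.sqrt(z))  # exact tail for 1 degree of freedom
    # Step up two degrees of freedom at a time using the recurrence
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < degrees_of_freedom / 2.0:
        tail += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return tail


# The two Class 1 terms quoted in the post (Region 1 and Region 2 only).
partial = chi_square_statistic([13, 15], [13.37, 15.39])
print(f"partial sum for Regions 1 and 2: {partial:.4f}")

# Hypothetical four-region example, NOT Project "W" data.
observed = [18, 22, 30, 30]
expected = [25.0, 25.0, 25.0, 25.0]
statistic = chi_square_statistic(observed, expected)
p_value = chi_square_p_value(statistic, degrees_of_freedom=len(observed) - 1)
print(f"statistic = {statistic:.2f}, p-value = {p_value:.3f}")
print("reject" if p_value < 0.05 else "fail to reject")
```

The two quoted Class 1 terms contribute very little to the sum, which is why the overall p-value for Class 1 ends up so high.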

[Figure: Class 1]

Please note, however, that we fail to meet one of the 3 conditions for this test to be valid: we only have 1 observation for region A-R00001 and we need a minimum of 5. In this case, the p-value is so high and the observations are so close overall that I feel more data gathering will only strengthen this result.

Seeing this I was both elated and disappointed. Fantastic! I thought, the test works and wormhole space connections are truly random… well dern, I was hoping to see the hypothesis fail, meaning there’s favoritism between regions of space, non-randomness if you will. Well, we have this data, let’s keep looking.

What about the other wormhole classes and known space…

In the next two slides you can see the test results for the other wormhole and known space regions. The p-values vary from 0.17 (which still passes) and 0.33 up to 0.89. You can also see we’re missing a fair number of observations in various regions, again reiterating that we need more data. It’s still interesting to see that there does appear to be enough data to begin seeing that connections appear to be random. As I said before, more data is likely to strengthen the results.

Who’s missing… ?

Did you notice there were two areas of space that were missing from the previous two slides? High Sec space and Class 5 wormholes. Take a look at the next slide. They both failed, and not borderline either: they failed by a wide margin, High Sec with a p-value of 0.0000000005 and Class 5’s with 0.0003. Since the p-values are less than the significance level of 0.05, we reject the null hypothesis. The TLDR: connections to High Sec and Class 5 wormholes are not equally distributed. They appear not to be random.

[Figure: Class 5 test results]

[Figure: High Sec test results]

Keep in mind, there is not enough data to confirm or deny these results, but isn’t it strange that it seems we have enough data for all regions of space to pass them except for these two? We do have observations from almost all of their respective regions, not the minimum, but still a fair sampling.

Wormhole Classes and Known Space by Chi-square ranking

So, who are our offenders? One region is clear as it jumps off the chart, Genesis, but are there others? In order to find out, we’ll sort our result set by their Chi-Square computation. For our class 5’s it was region E-R00024, the shattered wormholes for that class. The next slide shows us that it was Genesis and Molden Heath from High Sec.

[Figure: Class 5 by Chi-sq]

[Figure: High Sec by Chi-sq]

What does it mean?

  • Using a connection that leads to High Sec, the expected probability of landing in Genesis was 3%. Based on observed data, Genesis was 20% (9 out of 45).
  • Using a connection that leads to High Sec, the expected probability of landing in Molden Heath was 1%. Based on observed data, Molden Heath was 9% (4 out of 45).
  • Together, both Genesis and Molden Heath accounted for 29% of jumps to High Sec.
  • Using a connection that leads to Class 5 wormhole space, the expected probability of landing in E-R00024 was 4%. Based on observed data, E-R00024 was 19% (4 out of 21).
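
The percentages above, and the reason these regions rank so high by Chi-Square, can be reproduced with a few lines of Python. This is only a sketch: the expected percentages are the rounded figures quoted in the list, so the Chi-Square terms are approximate, and the function names are made up for this example.

```python
def observed_share(count, total):
    """Fraction of observed connections that landed in a region."""
    return count / total


def chi_square_term(observed, expected_share, total):
    """One region's contribution: (observed - expected)^2 / expected."""
    expected = expected_share * total
    return (observed - expected) ** 2 / expected


high_sec_total = 45
class5_total = 21

# (observed count, expected share as quoted above, total for that test)
regions = {
    "Genesis": (9, 0.03, high_sec_total),
    "Molden Heath": (4, 0.01, high_sec_total),
    "E-R00024": (4, 0.04, class5_total),
}

for name, (count, share, total) in regions.items():
    pct = 100 * observed_share(count, total)
    term = chi_square_term(count, share, total)
    print(f"{name}: observed {pct:.0f}%, expected {share:.0%}, "
          f"chi-square term {term:.2f}")

combined = observed_share(9 + 4, high_sec_total)
print(f"Genesis + Molden Heath: {combined:.0%} of High Sec jumps")
```

Because each term divides by a very small expected count, a handful of extra observations in a small region produces a very large contribution. That is also why the minimum-of-5 condition matters so much.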

From a couple of chat sessions I had with my fellow corpmates when I presented these findings, the speculation was that Genesis is a favored region for Signal Cartel, because one of our offices is located in the Zoohen system. Because we don’t have enough data, it is possible this is at play. But what about Molden Heath and E-R00024? What’s special about them? Does that place doubt on the favoritism thoughts of the Genesis region because of Zoohen?

If not Signal Cartel bias, then what? We know Genesis is the home region for the EvE Gate. We know E-R00024 are the shattered wormholes for Class 5’s, but other regions have shattered wormholes. I did find out there is one unique system in the Class 5 shattered’s, J013146, a C5 Magnetar system with 7 shattered planets where we can find sleepers and Talocan Static Gates in the epicenter. Was this system perhaps where the cascade failure began? (Seems I need to find a historian). Is there a connection to the Eve Gate? But then what about Molden Heath? Is there something unique, different, or some observer favoritism going on?

Raw Data for the Anomalies

On this slide I wanted to present the data for the failed regions. I highlighted some commonalities among the entries, but it’s easy to see there is not enough data to draw any conclusions.

Conclusions

  • To positively confirm these results, we need to meet the minimum conditions for the Chi-Square Goodness of Fit test of at least 5 observations per region in High Sec and Class 5 wormholes. More data is needed.
  • The p-value results for both High Sec and Class 5 are way out of sync with the remainder of the findings. It seems unlikely the rejected result of the null hypothesis would be reversed with more data, but it is possible.
  • Even allowing for the minimum conditions of the Chi-Square test not being met, there seems to be enough data to say something odd seems to be going on with Genesis, Molden Heath, and E-R00024.
  • If we assume that more data will positively confirm these results, then the majority of known wormhole type connections are equally random across their respective destinations, with the exception of our 3 mysterious regions.
  • We know there’s something special about the Genesis and E-R00024 regions, but what about Molden Heath?

Final thoughts

Even though we don’t have enough data (have I said that enough 😉 ) to confirm or deny these findings, I find it odd that it appears we have enough to see the trend that, for the most part, connections to other regions are random, with the exception of Genesis, Molden Heath, and E-R00024. It could very well be favoritism for Genesis, but what of the other two regions? If nothing else, this study has only added to the mystery of wormhole connections and raised more questions than we started with. I think further observations, data gathering, and analysis are warranted. How to do that without any bias or favoritism going on will be the challenge.

Links

  • W-Space – Why you not random? My blog post that really started Project “W”.
  • Wormhole Type Database – a list of known wormhole connections and where they lead.
  • Database of New Eden Systems – All K-Space and W-Space systems and their information.
  • Project “W” Phase I Data – The raw data cross referenced with the above databases. Open to anyone who wishes to do their own analysis, confirm my results, or do your own test. I’m open and welcome anyone to do your own research with this data, it’s not going to bother me. All I ask is give Project “W” credit for the data gathered.
  • Signal Cartel – Home of EvE Online’s premier exploration corp.

W-Space – Why you not random?

“Adventure is a state of mind and spirit.” – Jacqueline Cochran, American Aviation Pioneer

10.4.yc118 J233630 < Constellation 262 < Region 26

J112241

When I began my exploration of wormhole space just a little over three months ago, I had decided to base out of Thera. I felt it was a great place to start with random wormholes appearing daily. Being a member of Signal Cartel brings the benefit that most would be scouted already and, taking advantage of that, I could quickly knock out several systems daily. Then I’d have to resort to scanning on my own, which takes some time. With over 2,500 wormhole systems and assuming complete randomness with the connections, statistically, it should’ve been some time before I started finding systems that I’d previously visited. At least that was the theory. But there’s something odd going on in wormhole space and it’s not the space effects that I’m talking about. The randomness of wormhole connections in each system doesn’t seem to be so random after all.

J102844 VI, Moon 1

On my tenth day into exploring W-Space while based in Thera, I encountered my first duplicate system. I had only previously explored a total of 17 systems, so my chances of finding a duplicate system should have been less than 1%. Yes, I hear you and understand, less than 1% chance is still a chance, so with a raised eyebrow, I continued to base from Thera. But here’s the thing, as I proceeded to explore, duplicate systems kept coming up, beating the odds of finding them until finally at the end of March with only 10% of W-Space explored, my odds of seeing duplicates seemed closer to 30%. I decided to forgo basing in Thera and have been wandering ever since.
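
It’s worth separating two different odds here: the chance that one particular jump lands on a repeat, and the chance of hitting at least one repeat somewhere along the way. Below is a minimal Python sketch, assuming exactly 2,500 wormhole systems (the post only says “over 2,500”) and that every system is equally likely on every jump, which is precisely the assumption in question. The function names are just for illustration.

```python
def per_jump_duplicate_chance(explored, total_systems):
    """Chance that the next system is one of those already explored."""
    return explored / total_systems


def chance_of_any_duplicate(visits, total_systems):
    """Chance of at least one repeat somewhere in a run of `visits` random visits."""
    p_all_distinct = 1.0
    for already_seen in range(visits):
        p_all_distinct *= 1 - already_seen / total_systems
    return 1 - p_all_distinct


TOTAL_SYSTEMS = 2500  # assumption: the post says "over 2,500"

single = per_jump_duplicate_chance(17, TOTAL_SYSTEMS)
cumulative = chance_of_any_duplicate(18, TOTAL_SYSTEMS)

print(f"repeat on the 18th visit specifically: {single:.2%}")
print(f"at least one repeat within 18 visits:  {cumulative:.2%}")
```

Under these assumptions the single-jump figure is indeed under 1%, while the cumulative chance of seeing some repeat by the 18th visit is noticeably higher, at roughly 6%. That is still uncommon, just not as rare as the per-jump number suggests.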

Now with my supposedly random wandering it gets more interesting. This last week, just the last few days really, has convinced me of the not so randomness of W-Space, that there’s a pattern to the chaos. One day, I hit system after system that I had previously explored, until finally I found one I had not. It was around ten systems, making that day’s odds over 90% likely to find systems I’ve already been to. What? How? It gets more interesting. The next day, I found system after system I had NOT been to yet, making that day’s odds over 90% likely to find systems I’ve NOT already been to. Very odd.

J115909

I don’t have an answer, so what am I proposing? I believe wormhole systems cluster together and connect more often than not to the same systems over and over again. Granted, I believe the clusters are rather large groups, maybe upwards of 600 or even 700 systems or so, but based on my experience so far, it would fit the odds I’ve been experiencing.
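
A back-of-the-envelope way to sanity-check the cluster idea in Python: if every system I had explored sat inside one cluster, and each jump picked a system from that cluster at random, the odds of a repeat would be the number explored divided by the cluster size. Both of those are simplifying assumptions, as is the round figure of 2,500 total systems.

```python
def duplicate_odds(explored, pool_size):
    """Chance the next system is a repeat if every system in the pool is equally likely."""
    return min(1.0, explored / pool_size)


def implied_pool_size(explored, observed_duplicate_odds):
    """Pool size that would produce the observed repeat rate."""
    return explored / observed_duplicate_odds


total_systems = 2500                    # assumption: the post says "over 2,500"
explored = round(0.10 * total_systems)  # "10% of W-Space explored"

for pool in (total_systems, 700, 600):
    print(f"pool of {pool}: {duplicate_odds(explored, pool):.0%} chance of a repeat")

print(f"pool implied by ~30% repeats: "
      f"{implied_pool_size(explored, 0.30):.0f} systems")
```

Under these assumptions a repeat rate near 30% points to a pool of several hundred systems, in the same ballpark as the guess above though somewhat larger.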

I’d be curious to hear from folks who have actually lived in a single wormhole system for a long period of time. Have you noticed yourself connecting to the same systems over a period of several months? It’d be difficult to prove and would take a lot of observation from multiple systems, but it is interesting nevertheless.

Fly Clever!
Katia

UPDATE: After some discussion on Twitter and further reflection, I’m beginning to think there may be a light year limit between systems and their ability to connect to each other via wormholes, regardless of whether they’re K-Space or W-Space.

J170122