Nav-Appaiya
10/23/2015 - 7:24 PM

Datasets for Graph Hack.

Datasets for Graph Hack.

Datasets for Graph Hack 2014.

All stores are Neo4j 2.1.3.

Transportation

Flights

What is related, and how?

Flight  ORIGIN        Airport
Flight  DESTINATION   Airport
Flight  CARRIER       Carrier
Flight  AIRCRAFT      Aircraft
Flight  DELAYED_BY    Reason
Flight  CANCELLED_BY  Reason
Airport IN_CITY       City
City    IN_STATE      State
Flight  DIVERTED_TO   Airport

Which cities have more than 1 airport?

Query

MATCH (:Airport)-[:IN_CITY]->(c:City)
WITH c, COUNT(*) AS Count
WHERE Count > 1
RETURN c.name AS City, Count
ORDER BY Count

Result

CityCount
WASHINGTON, DC2
HOUSTON, TX2
NEW YORK, NY2
CHICAGO, IL2

Which cities have more than 1 airport, and what are they?

Query

MATCH (a:Airport)-[:IN_CITY]->(c:City)
WITH c.name AS City, COLLECT(a.name) AS Airports
WHERE LENGTH(Airports) > 1
RETURN City, Airports
ORDER BY LENGTH(Airports)

Result

CityAirports
HOUSTON, TXGEORGE BUSH INTERCONTINENTAL/HOUSTON, WILLIAM P HOBBY
WASHINGTON, DCRONALD REAGAN WASHINGTON NATIONAL, WASHINGTON DULLES INTERNATIONAL
NEW YORK, NYJOHN F. KENNEDY INTERNATIONAL, LAGUARDIA
CHICAGO, ILCHICAGO O'HARE INTERNATIONAL, CHICAGO MIDWAY INTERNATIONAL

Of the flights that departed from SFO and were diverted, what are the top airports that they were diverted to?

Query

MATCH (f:Flight)-[:ORIGIN]->(:Airport {abbr:'SFO'}),
      (f)-[d:DIVERTED_TO]->(a:Airport)
WHERE d.step = 1
RETURN a.abbr + ' - ' + a.name AS Airport, COUNT(*) AS Count
ORDER BY Count DESC
LIMIT 10

Result

AirportCount
SFO - SAN FRANCISCO INTERNATIONAL13
TWF - JOSLIN FIELD - MAGIC VALLEY REGIONAL6
RDM - ROBERTS FIELD5
GJT - GRAND JUNCTION REGIONAL3
RDD - REDDING MUNICIPAL3
EUG - MAHLON SWEET FIELD2
DTW - DETROIT METRO WAYNE COUNTY2
LAX - LOS ANGELES INTERNATIONAL2
BOS - LOGAN INTERNATIONAL2
LMT - KLAMATH FALLS AIRPORT2

Fleets

What is related, and how?

Airline   BASED_IN  Country
Series    MODEL     Model
Aircraft  SERIES    Series
Aircraft  OWNED_BY  Airline

Which airlines own the most active Boeing 747s? How many do they own? In which countries are these airlines based?

Query

MATCH (a:Airline)-[:BASED_IN]->(c:Country),
      (craft:Aircraft)-[:OWNED_BY]->(a),
      (craft)-[:SERIES]->(:Series)-[:MODEL]->(m:Model)
WHERE craft.status = 'Active' AND m.name = 'Boeing 747'
WITH a, COUNT(m) AS count, c
ORDER BY count DESC
RETURN a.name AS Airline, c.name AS Country, count AS `Number of Boeing 747 Owned`
LIMIT 10

Result

AirlineCountryNumber of Boeing 747 Owned
British AirwaysUnited Kingdom48
Korean AirSouth Korea36
Cathay PacificHong-Kong33
LufthansaGermany32
China AirlinesTaiwan31
KLMNetherlands26
United AirlinesUSA24
Atlas AirUSA22
Transaero AirlinesRussia20
CargoluxLuxembourg19

Which Embraer models have the most series of planes? What are they?

Query

MATCH (s:Series)-[:MODEL]->(m:Model)
WHERE m.name =~ 'Embraer.*'
WITH m.name AS Model, COLLECT(s.name) AS Series
RETURN Model, Series
ORDER BY LENGTH(Series) DESC

Result

ModelSeries
Embraer 135/145145SM, 145RS, 145EW, 145MP, 145BJ, 135ER, 135EJ, 135LR, 135, 145MR, 145LI, 145XR, 145ER, 145EP, 145LR, 145LU, 145SA, 135BK, 145EU, 145EJ, 135BJ, 140LR
Embraer 190/195195STD, 190SLR, 190ECJ, 190BJ, 195LR, 190IGW, 195AR, 195SR, 190ER, 190, 195IGW, 190AR, 190LR, 190STD, 190SR
Embraer 170/175175SR, 175SU, 170SL, 170SE, 170STD, 175SD, 175LR, 170LR, 170SU, 175STD, 170ST, 175IGW, 170AR, 170SR
Embraer 120 Brasilia120ER, 120QC, 120, 120RT

Dallas/Forth Worth International Airport

What is related, and how?

Gate   IN_TERMINAL  Terminal
Place  IN_CATEGORY  Category
Place  AT_GATE      Gate

At which gates in terminal A can I find a Starbucks?

Query

MATCH (p:Place)-[:AT_GATE]->(g:Gate),
      (g)-[:IN_TERMINAL]->(t:Terminal)
WHERE p.name = 'Starbucks Coffee' AND t.name = 'A'
RETURN g.gate AS Gate, t.name AS Terminal
ORDER BY Gate

Result

GateTerminal
29A
33A
37A

Where are places in terminal D in categories 'Bar' or 'Barbecue'?

Query

MATCH (p:Place)-[:IN_CATEGORY]->(c:Category)
WHERE c.name IN ['Bar', 'Barbecue']
WITH p
MATCH (p)-[:AT_GATE]->(g:Gate),
      (g)-[:IN_TERMINAL]->(t:Terminal)
WHERE t.name = 'D'
RETURN p.name AS Place, g.gate AS Gate
ORDER BY Gate

Result

PlaceGate
Railhead BBQ18
Tigin Irish Pub & Restaurant20
Cousin's Bar-B-Q28
Gas Monkey Bar & Grill31

Caltrain

What is related, and how?

Stop  IN        Zone
Stop  NEXT      Stop
Train STOPS_AT  Stop

Which consecutive stops are furthest apart?

Query

MATCH (s1:Stop)-[r:NEXT]->(s2:Stop)
WITH s1, s2, r.distance AS Distance
ORDER BY Distance DESC
LIMIT 1
RETURN s1.name AS Stop1, s2.name AS Stop2, Distance

Result

Stop1Stop2Distance
BLOSSOM HILLMORGAN HILL11.8

Which trains can I take home to San Mateo from San Francisco after the hackathon tonight?

Query

MATCH (t:Train)
WHERE t.direction = 'Southbound' AND t.type = 'Weekday'
WITH t

MATCH (t)-[leave:STOPS_AT]->(begin:Stop),
      (t)-[:STOPS_AT]->(end:Stop)
WHERE begin.name = 'SAN FRANCISCO' AND
      end.name = 'SAN MATEO'
WITH t, leave.hour*100 + leave.minute AS minutes, leave.hour_s AS Hour, leave.minute_s AS Minute
WHERE minutes > 2100 OR minutes < 100
RETURN t.id AS Train, Hour + ':' + Minute AS `Departure Time`

Result

TrainDeparture Time
19421:40
19622:40
19800:01

Music

Last.fm

What is related, and how?

Scrobble  PLAYED     Track
Scrobble  NEXT       Scrobble
User      SCROBBLED  Scrobble
Artist    MEMBER_OF  Genre
Track     SUNG_BY    Artist

Which artists do users 'nmwhite0131' and 'smooligans' mutually listen to most often?

Query

MATCH (u1:User {username:'nmwhite0131'}), (u2:User {username:'smooligans'})
WITH u1, u2
MATCH (u1)-[:SCROBBLED]->(:Scrobble)-[:PLAYED]->(:Track)-[:SUNG_BY]->(a:Artist),
      (u2)-[:SCROBBLED]->(:Scrobble)-[:PLAYED]->(:Track)-[:SUNG_BY]->(a)
RETURN a.name AS Artist, COUNT(*) AS Count
ORDER BY Count DESC
LIMIT 10

Result

ArtistCount
Drake15019
Beyoncé6869
Lana Del Rey3723
Jay-Z2967
Nicki Minaj2624
Lil' Wayne2242
Rihanna1911
Daft Punk1560
Explosions in the Sky1165
Tyga663

User nmwhite0131's last five scrobbles.

Query

MATCH (:User {username:'nmwhite0131'})-[:SCROBBLED]->(last:Scrobble),
      recent = (:Scrobble)-[:NEXT*4]->(last)
WHERE NOT ((last)-[:NEXT]->(:Scrobble))
WITH NODES(recent) AS scrobbles
UNWIND scrobbles AS s
MATCH (s)-[:PLAYED]->(t:Track)-[:SUNG_BY]->(a:Artist)
RETURN s.date AS Date, t.name AS Track, a.name AS Artist

Result

DateTrackArtist
2014-09-24EasySaycet
2014-09-24BruyèreSaycet
2014-09-24OpalSaycet
2014-09-24Her MovieSaycet
2014-09-24We Walk FastSaycet

Social

Twitter

What is related, and how?

Tweet    USING     Source
Tweet    CONTAINS  Link
Tweet    MENTIONS  User
Tweet    RETWEETS  Tweet
User     POSTS     Tweet
Tweet    REPLY_TO  Tweet
Hashtag  TAGS      Tweet

How are hashtags 'java' and 'graph' related?

Query

MATCH (java:Hashtag {name:'java'}), (graph:Hashtag {name:'graph'})
WITH java, graph
MATCH p = shortestPath((java)-[:TAGS*]-(graph))
RETURN p

Result

What is the most retweeted tweet, who posted it, and what did it say?

Query

MATCH (:Tweet)-[:RETWEETS]->(t:Tweet)
WITH t, COUNT(*) AS count
ORDER BY count DESC
LIMIT 1

MATCH (u:User)-[:POSTS]->(t)
RETURN u.screen_name AS User, t.text AS Tweet, count AS Retweets

Result

UserTweetRetweets
println_tsA twist on how to do login & registration form on #Neo4J's website. #ux #ui http://t.co/CmDIfkSDT3 http://t.co/fjtSik59Qa5