Build Your Own World Class Directory Search From Alpha to Omega

Copyright © President & Fellows of Harvard College Build Your Own World-Class Directory Search From Α to Ω Ravi Mynampaty


Page 1: Build Your Own World Class Directory Search From Alpha to Omega

Copyright © President & Fellows of Harvard College

Build Your Own World-Class Directory Search From Α to Ω

Ravi Mynampaty

Page 2: Build Your Own World Class Directory Search From Alpha to Omega

About Ravi

A hustler making a living by pretending to know more about

Enterprise Search than he actually does...

Page 3: Build Your Own World Class Directory Search From Alpha to Omega
Page 4: Build Your Own World Class Directory Search From Alpha to Omega

“I can live on a good compliment two weeks with nothing else to eat...”

@RaviMynampaty

Page 5: Build Your Own World Class Directory Search From Alpha to Omega

Why the heck should I listen to Ravi?

Page 6: Build Your Own World Class Directory Search From Alpha to Omega
Page 7: Build Your Own World Class Directory Search From Alpha to Omega

Agenda

Why?

What’s the fuss about?

What features?

What data?

How?

Page 8: Build Your Own World Class Directory Search From Alpha to Omega

Where are we going?

Search Index

Model & Structure

Raw data

Prototype UI

Page 9: Build Your Own World Class Directory Search From Alpha to Omega

My goal… (how many iterations?)

Page 10: Build Your Own World Class Directory Search From Alpha to Omega

Icebreaker

Page 11: Build Your Own World Class Directory Search From Alpha to Omega

Why?

Page 12: Build Your Own World Class Directory Search From Alpha to Omega
Page 13: Build Your Own World Class Directory Search From Alpha to Omega
Page 14: Build Your Own World Class Directory Search From Alpha to Omega
Page 15: Build Your Own World Class Directory Search From Alpha to Omega
Page 16: Build Your Own World Class Directory Search From Alpha to Omega

What’s the big deal?

Page 17: Build Your Own World Class Directory Search From Alpha to Omega

Records vs.

Documents

Page 18: Build Your Own World Class Directory Search From Alpha to Omega

Person: family_name, first_name, phone, email, ...

vs.

Document: Title: ..., Description: ..., Content: ……

Page 19: Build Your Own World Class Directory Search From Alpha to Omega

Content: ……

Content: ……

vs.

Person: family_name, first_name, phone, email, ...

Document: Title: ..., Description: ..., Content: ……

Page 20: Build Your Own World Class Directory Search From Alpha to Omega

Nicknames

Page 21: Build Your Own World Class Directory Search From Alpha to Omega

Predictable

• Elizabeth

• Beth, Bess, Betty, Liz

• Richard

• Rich, Dick

• David

• Dave

Page 22: Build Your Own World Class Directory Search From Alpha to Omega

Simple Substrings

Srinivas → “Srini”

Mohammad → “Mo”

Page 23: Build Your Own World Class Directory Search From Alpha to Omega

Somewhat Predictable

Yakub → “Jacob”

Yusuf → “Joseph”

Xian → “Sean”

Page 24: Build Your Own World Class Directory Search From Alpha to Omega

Unpredictable

Hanuman → “Hank”

Madhav → “Mike”

Babu → “Bob”

Wongsu → “Richard”

Herman → “Dutch”
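
One way to make nicknames searchable is a lookup table that expands a formal name into its known variants at index time, plus a reverse map for query time. The following is a minimal sketch, assuming a hand-maintained table; the table contents come from the slide examples, and the function names (`expand_name`, `build_reverse_index`) are hypothetical. Unpredictable nicknames such as "Dutch" can only be handled this way if someone records them.

```python
# Hypothetical nickname table, seeded with examples from the slides.
NICKNAMES = {
    "elizabeth": ["beth", "bess", "betty", "liz"],
    "richard": ["rich", "dick"],
    "david": ["dave"],
    "srinivas": ["srini"],
    "hanuman": ["hank"],
    "herman": ["dutch"],
}


def expand_name(first_name):
    """Return the formal name plus any known nicknames, lowercased."""
    key = first_name.strip().lower()
    return [key] + NICKNAMES.get(key, [])


def build_reverse_index(table):
    """Map each nickname back to the formal names it can stand for."""
    reverse = {}
    for formal, nicks in table.items():
        for nick in nicks:
            reverse.setdefault(nick, []).append(formal)
    return reverse


REVERSE = build_reverse_index(NICKNAMES)

print(expand_name("Elizabeth"))  # formal name followed by its nicknames
print(REVERSE["dutch"])          # formal names that "dutch" may refer to
```

The expanded list could be written into a nickname field on each person record, which is roughly what the Nickname field in the second matrix does.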

Page 25: Build Your Own World Class Directory Search From Alpha to Omega

Abbreviations & Acronyms

Page 26: Build Your Own World Class Directory Search From Alpha to Omega

Department Names

Information Technology

ITG

Info Tech.

HBS IT

Page 27: Build Your Own World Class Directory Search From Alpha to Omega

Job Titles

CEO

PM

VP

...

Page 28: Build Your Own World Class Directory Search From Alpha to Omega

Educational Degrees

PhD

JD

...

ALM → “Master of Liberal Arts”

(magistri in artibus liberalibus studiorum prolatorum)

Page 29: Build Your Own World Class Directory Search From Alpha to Omega

Substrings

Page 30: Build Your Own World Class Directory Search From Alpha to Omega

Experiment

Page 31: Build Your Own World Class Directory Search From Alpha to Omega

I’d like you to meet...

Page 32: Build Your Own World Class Directory Search From Alpha to Omega

Garoppolo

Page 33: Build Your Own World Class Directory Search From Alpha to Omega

What was that guy’s name?

Page 34: Build Your Own World Class Directory Search From Alpha to Omega
Page 35: Build Your Own World Class Directory Search From Alpha to Omega

What did you search for?

Page 36: Build Your Own World Class Directory Search From Alpha to Omega

One more...

Page 37: Build Your Own World Class Directory Search From Alpha to Omega

Roethlisberger

Page 38: Build Your Own World Class Directory Search From Alpha to Omega

What was that guy’s name?

Page 39: Build Your Own World Class Directory Search From Alpha to Omega
Page 40: Build Your Own World Class Directory Search From Alpha to Omega

What did you search for?

Page 41: Build Your Own World Class Directory Search From Alpha to Omega

My prediction...

Page 42: Build Your Own World Class Directory Search From Alpha to Omega

G…. & R...

Page 43: Build Your Own World Class Directory Search From Alpha to Omega

Jimmy Garoppolo

Page 44: Build Your Own World Class Directory Search From Alpha to Omega

Ben Roethlisberger

Page 45: Build Your Own World Class Directory Search From Alpha to Omega

Exercise!

Page 46: Build Your Own World Class Directory Search From Alpha to Omega

Wishlist: How should search work?

Search mechanisms

What should be searchable?

How should users be able to search?

Query interface: What should be supported?

Results interface: What should be displayed?

etc.

Page 47: Build Your Own World Class Directory Search From Alpha to Omega

Wishlist discussion

Page 48: Build Your Own World Class Directory Search From Alpha to Omega

Features

Search by:

Name, Department, Email, Job title, Phone number

Nicknames, Aliases

Substrings

Scoped search, Sort options

Faceting/filtering options

Spelling suggestions, Autocomplete, Devices, Voice search

Page 49: Build Your Own World Class Directory Search From Alpha to Omega

Hands-on!

Page 50: Build Your Own World Class Directory Search From Alpha to Omega

Install Sublime Text

https://www.sublimetext.com/3

Page 51: Build Your Own World Class Directory Search From Alpha to Omega

Let’s create some data

Page 52: Build Your Own World Class Directory Search From Alpha to Omega
Page 53: Build Your Own World Class Directory Search From Alpha to Omega

Solr XML

<add>
  <doc>
    <field name="id">1813-05-05</field>
    <field name="LastName">Kierkegaard</field>
    <field name="FirstName">Søren</field>
  </doc>
  <doc>
    <field name="id">1966-12-14</field>
    <field name="LastName">Thorning-Schmidt</field>
    <field name="FirstName">Helle</field>
  </doc>
</add>

Page 54: Build Your Own World Class Directory Search From Alpha to Omega

That was just for practice

Page 55: Build Your Own World Class Directory Search From Alpha to Omega

Our Dataset: Members of US Congress

Need to create XML for 400+ people records
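
Writing XML by hand does not scale to 400+ records, so a small script helps. This is a minimal sketch, assuming the records are already available as Python dictionaries; the function name `records_to_solr_xml` is hypothetical, and the two sample records are the practice entries from the earlier slide. It produces the same `<add>`/`<doc>`/`<field>` layout as the practice file.

```python
import xml.etree.ElementTree as ET


def records_to_solr_xml(records):
    """Serialize a list of dicts as a Solr XML <add> document."""
    add = ET.Element("add")
    for record in records:
        doc = ET.SubElement(add, "doc")
        for name, value in record.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = value  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


# Sample records taken from the practice slide.
people = [
    {"id": "1813-05-05", "LastName": "Kierkegaard", "FirstName": "Søren"},
    {"id": "1966-12-14", "LastName": "Thorning-Schmidt", "FirstName": "Helle"},
]

xml_text = records_to_solr_xml(people)
print(xml_text)
```

In practice the dictionaries might come from a spreadsheet or CSV export; that step is left out here.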

Page 56: Build Your Own World Class Directory Search From Alpha to Omega

Install JDK 1.8

http://tinyurl.com/ie17java

(set JAVA_HOME env variable)

java -version

javac -version

echo $JAVA_HOME (on *nix)

echo %JAVA_HOME% (on Windows)

Page 57: Build Your Own World Class Directory Search From Alpha to Omega

Install Fusion

https://lucidworks.com/

Download + Unzip

Run it!

Open cmd prompt

cd ...\fusion-3.0.0\fusion\3.0.0\bin

Page 58: Build Your Own World Class Directory Search From Alpha to Omega

Run it: “fusion.cmd start”

C:\..\Desktop\fusion-3.0.0\fusion\3.0.0\bin>fusion.cmd start
Starting zookeeper..
Successfully started zookeeper on port 9983 (process ID 144
Starting solr..............
Successfully started solr on port 8983 (process ID 19564)
Starting api............................
Successfully started api on port 8765 (process ID 12568)
Starting connectors..........................
Successfully started connectors on port 8984 (process ID 18
Starting ui.............
Successfully started ui on port 8764 (process ID 14096)

Page 59: Build Your Own World Class Directory Search From Alpha to Omega

Admin UI: http://localhost:8764/

1. Create password

Follow along with me:

1. Quickstart

2. Create a new collection (call it “Test1”)

3. Select a dataset: “Revolution Session Data”

4. Try some searches

5. Add faceted search

Page 60: Build Your Own World Class Directory Search From Alpha to Omega

Break

Page 61: Build Your Own World Class Directory Search From Alpha to Omega

1st Matrix

Page 62: Build Your Own World Class Directory Search From Alpha to Omega
Page 63: Build Your Own World Class Directory Search From Alpha to Omega
Page 64: Build Your Own World Class Directory Search From Alpha to Omega

1st Matrix: matrix1.xml

<doc>
  <field name="PersonId">Gabbard, Tulsi</field>
  <field name="LastName">Gabbard</field>
  <field name="FirstName">Tulsi</field>
  <field name="State">Hawaii</field>
  <field name="District">2nd District</field>
  <field name="Room">1433 LHOB</field>
  <field name="Phone">202-225-4906</field>
  <field name="Party">Democratic</field>
  <field name="Committee">Armed Services</field>
  <field name="Email">[email protected]</field>
</doc>

Page 65: Build Your Own World Class Directory Search From Alpha to Omega

Matrix XML

http://tinyurl.com/ie17matrix

Page 66: Build Your Own World Class Directory Search From Alpha to Omega

Create Solr collection for US Congress

http://localhost:8764/

Devops → New → Collection Name “house” → Save Collection

Configure Fields

Create Datasource → Add → Filesystem → SolrXML → Datasource ID

Path: set path to XML file on disk: C:\cygwin64\home\rmynampaty\house\matrix1.xml

Start Crawl → (Wait for finish) → Job History → (Observe success/fail)

Page 67: Build Your Own World Class Directory Search From Alpha to Omega

Let’s search!

Query Workbench

Format Results → Documents

- One Primary Field (which one do you think?)

- One Secondary

- One Other

Page 68: Build Your Own World Class Directory Search From Alpha to Omega

_s vs. _t

String (_s):

Stored as-is: no tokenizing, case preserved

Text (_t):

Tokenized, stopwords removed, lowercased
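
The difference shows up as soon as you search for part of a value. The following is a simplified sketch of the two behaviours, not the actual analysis chain used by the search engine: the stopword list, the whitespace tokenizer, and the function names are all assumptions made for illustration.

```python
# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"a", "an", "and", "of", "the"}


def index_as_string(value):
    """String-style field: the whole value is one term, case preserved."""
    return [value]


def index_as_text(value):
    """Text-style field: lowercase, split on whitespace, drop stopwords."""
    tokens = value.lower().replace(",", " ").split()
    return [t for t in tokens if t not in STOPWORDS]


def matches(indexed_terms, query_term, analyzed):
    """Exact term lookup; the query is lowercased only for analyzed fields."""
    term = query_term.lower() if analyzed else query_term
    return term in indexed_terms


committee = "Armed Services"
as_string = index_as_string(committee)
as_text = index_as_text(committee)

print(matches(as_string, "armed", analyzed=False))           # partial value, string field
print(matches(as_string, "Armed Services", analyzed=False))  # full value, string field
print(matches(as_text, "Armed", analyzed=True))              # single word, text field
```

String fields suit exact values used for sorting and faceting; text fields suit free-form searching.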

Page 69: Build Your Own World Class Directory Search From Alpha to Omega

Query Workbench

Try some searches: are they generally working?

Sort: what sort makes sense for people search?

Page 70: Build Your Own World Class Directory Search From Alpha to Omega

Scoped Search

<field_name>:<search_string>

e.g.,

State_s:Hawaii

Page 71: Build Your Own World Class Directory Search From Alpha to Omega

Booleans

AND, "+", OR, NOT and "-"

e.g.,

LastName_s:Smith AND Party_s:Republican
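
Scoped and Boolean queries can also be assembled in code and sent to Solr directly. This is a minimal sketch that only builds the request URL. It assumes Solr's standard `/select` handler on port 8983 (the port shown in the startup log) and the collection name "house"; the helper names `scoped` and `build_query_url` are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint: Solr's standard select handler for the "house" collection.
SOLR_BASE = "http://localhost:8983/solr/house/select"


def scoped(field, value):
    """Build a field:value clause, quoting values that contain spaces."""
    if " " in value:
        value = '"' + value + '"'
    return f"{field}:{value}"


def build_query_url(clauses, operator="AND", rows=10):
    """Join clauses with a Boolean operator and URL-encode the request."""
    q = f" {operator} ".join(clauses)
    return SOLR_BASE + "?" + urlencode({"q": q, "rows": rows, "wt": "json"})


url = build_query_url([
    scoped("LastName_s", "Smith"),
    scoped("Party_s", "Republican"),
])
print(url)
print(scoped("Committee_s", "Armed Services"))
```

The same query string can be pasted into the Query Workbench search box without the URL encoding.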

Page 72: Build Your Own World Class Directory Search From Alpha to Omega

Fuzzy Matching

e.g.,

Castor~

Castor~0.8
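
The `~` operator asks for approximate matches. A minimal sketch of the idea, assuming matching is based on edit distance with an illustrative limit of two edits. Plain Levenshtein distance is used here, and the search engine's own implementation may compute distances differently. The name list is illustrative only.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def fuzzy_matches(query, names, max_edits=2):
    """Return names within max_edits of the query, ignoring case."""
    q = query.lower()
    return [n for n in names if edit_distance(q, n.lower()) <= max_edits]


names = ["Castor", "Castro", "Carter", "Gabbard"]  # illustrative list
print(fuzzy_matches("Castor", names))
print(fuzzy_matches("Castor", names, max_edits=0))
```

A looser limit catches more misspellings but also pulls in unrelated names, which matters in a directory where many surnames are only a letter or two apart.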

Page 73: Build Your Own World Class Directory Search From Alpha to Omega

Synonyms

Page 74: Build Your Own World Class Directory Search From Alpha to Omega
Page 75: Build Your Own World Class Directory Search From Alpha to Omega

No one uses scoped / Boolean :-(

Page 76: Build Your Own World Class Directory Search From Alpha to Omega

Trick them!

Page 77: Build Your Own World Class Directory Search From Alpha to Omega

Facets

What facets make sense for people search? Add some.
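
Conceptually, a facet is a count of results per distinct field value, and selecting a facet value filters the result set. This is a rough sketch of that idea in plain Python, not how the search engine computes facets internally. Apart from the Gabbard entry from the matrix, the records are placeholders, and the function names are hypothetical.

```python
from collections import Counter


def facet_counts(records, field):
    """Count how many records carry each distinct value of a field."""
    return Counter(r[field] for r in records if field in r)


def filter_by(records, field, value):
    """Keep only the records whose field equals the selected facet value."""
    return [r for r in records if r.get(field) == value]


# Placeholder records; only the first one comes from the slides.
records = [
    {"LastName": "Gabbard", "State": "Hawaii", "Party": "Democratic"},
    {"LastName": "Example1", "State": "Hawaii", "Party": "Democratic"},
    {"LastName": "Example2", "State": "Texas", "Party": "Republican"},
]

print(facet_counts(records, "Party").most_common())
print([r["LastName"] for r in filter_by(records, "State", "Hawaii")])
```

Fields with a small number of distinct values, such as State, Party, or Committee, tend to make useful facets; near-unique fields such as Phone do not.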

Page 78: Build Your Own World Class Directory Search From Alpha to Omega

2nd Matrix

Page 79: Build Your Own World Class Directory Search From Alpha to Omega

2nd Matrix: matrix2.xml

<doc>
  <field name="PersonId">Gabbard, Tulsi</field>
  <field name="LastInitial">G</field>
  <field name="LastName">Gabbard</field>
  <field name="FirstName">Tulsi</field>
  <field name="Nickname">POTUS2024</field>
  <field name="State">Hawaii</field>
  <field name="District">2nd District</field>
  <field name="Room">1433 LHOB</field>
  <field name="Phone">202-225-4906</field>
  <field name="Party">Democratic</field>
  <field name="Committee">Armed Services</field>
  <field name="Email">[email protected]</field>
</doc>

Page 80: Build Your Own World Class Directory Search From Alpha to Omega

Recrawl Solr collection for US Congress

http://localhost:8764/

Devops → Collection Name → Datasource → Clear Datasource

Path: set path to XML file on disk: C:\cygwin64\home\rmynampaty\house\matrix2.xml

Start Crawl → (Wait for finish) → Job History → (Observe success/fail)

Search using Query Workbench

Page 81: Build Your Own World Class Directory Search From Alpha to Omega

Substrings

Did they work?

Page 82: Build Your Own World Class Directory Search From Alpha to Omega

3rd Matrix

Page 83: Build Your Own World Class Directory Search From Alpha to Omega

3rd Matrix: matrix3.xml

Gabbard

- Gabbard

- gabbar

- gabba

- gabb

- gab

- ga

- g

Substrings via N-grams
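
The list above is the set of leading substrings of the name, often called edge n-grams. A minimal sketch of generating them and using them as index keys follows; the function names are hypothetical, the extra surnames are illustrative, and everything is lowercased for simplicity.

```python
def edge_ngrams(term, min_size=1):
    """Return every leading substring of term, shortest first, lowercased."""
    lowered = term.lower()
    return [lowered[:n] for n in range(min_size, len(lowered) + 1)]


def build_prefix_index(names):
    """Map each edge n-gram to the set of names that start with it."""
    index = {}
    for name in names:
        for gram in edge_ngrams(name):
            index.setdefault(gram, set()).add(name)
    return index


print(edge_ngrams("Gabbard"))

# Illustrative surnames sharing a prefix.
index = build_prefix_index(["Gabbard", "Garamendi", "Gaetz"])
print(sorted(index["ga"]))
print(sorted(index["gab"]))
```

Because every prefix is stored at index time, a partial query such as "gab" becomes a plain exact-term lookup. The cost is a larger index.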

Page 84: Build Your Own World Class Directory Search From Alpha to Omega

Recrawl Solr collection for US Congress

http://localhost:8764/

Devops → Collection Name → Datasource → Clear Datasource

Path: set path to XML file on disk: C:\cygwin64\home\rmynampaty\house\matrix3.xml

Start Crawl → (Wait for finish) → Job History → (Observe success/fail)

Search using Query Workbench

Page 85: Build Your Own World Class Directory Search From Alpha to Omega

Where are we?

Search Index

Model & Structure

Raw data

Prototype UI

Page 86: Build Your Own World Class Directory Search From Alpha to Omega

Next steps for you

Search Index

Model & Structure

End-user UI

Raw data

Prototype UI

Page 87: Build Your Own World Class Directory Search From Alpha to Omega

Thank you!

Questions?

[email protected]
@RaviMynampaty
linkedin.com/in/mynampaty
facebook.com/ravi.mynampaty