Anna Divoli (Pingar Research) "How taxonomies and facets bring end-users closer to big data" TAW2012

Posted on 20-Jan-2015


Talk presented at Text Analytics World 2012 in Boston. More info: http://www.textanalyticsworld.com/boston/2012/agenda/full-agenda#day2415-5-2


How taxonomies and facets bring end-users closer to big data

Anna Divoli (@annadivoli)

Boston Oct 2012

Taxonomies

• τάξις/τάξη + νομία (arrangement/class + method/rule/law)
• hierarchical classification
• formal nomenclature
• varied dimensions
• evaluation/measures/metrics
• types: manually constructed, social, auto-generated
• purposes: auto-indexing, search facilitation, navigation, knowledge management, organization…
• it is OK to change the classification systems to adjust to new knowledge – not just adding new concepts
• the data have become “big” and available but not accessible
• many “end users”


User Studies Types

Specialized domain studies:

1. Facets (HCIR): Biomedical Scientists

2. Expert needs (media group)

UI preferred features studies:

3. Existing popular systems (EuroHCIR)

4. Mock ups of specific features (survey)


Anna Divoli and Alyona Medelyan, Search interface feature evaluation in biosciences, HCIR 2011, Google, Mountain View, CA

Matthew Pike, Max L. Wilson, Anna Divoli and Alyona Medelyan, CUES: Cognitive Usability Evaluation System, EuroHCIR 2012, Nijmegen, Netherlands


Our studies

1. Facets (HCIR): Biomedical Scientists

Anna Divoli and Alyona Medelyan, Search interface feature evaluation in biosciences, HCIR 2011, Google, Mountain View, CA

Facets – favorite feature for search systems


Facets (in search systems)

Query: animal models huntington disease

[Figure: Bio-Facets, most liked and least liked interfaces]

Facets as search features for biomedical scientists: Findings

• Faceted search is the most important standalone feature in a search interface for bioscientists.

• Few, query-oriented facets presented as checkboxes work best.

• Overly simple aesthetics, although not desirable, do not hurt overall UI score.

• Complex aesthetics turn users away from the systems.

• Bioscientists prefer tools that help them narrow their search, not expand it.

• For generic search: doc-based facets. For domain-specific search: query-based facets.
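The findings about a few checkbox facets that narrow a search can be illustrated with a minimal sketch. It is not the system evaluated in the study; the toy documents, the facet names (`organism`, `type`) and the helper functions `facet_counts` and `narrow` are all hypothetical. The sketch assumes that ticked values within one facet are OR-ed and that different facets are AND-ed, which is one common convention for checkbox facets.

```python
from collections import Counter

# Hypothetical toy result set for the query "animal models huntington disease".
# Facet names and values are illustrative assumptions, not data from the study.
DOCS = [
    {"id": 1, "title": "Mouse models of Huntington disease", "organism": "mouse", "type": "review"},
    {"id": 2, "title": "Huntingtin aggregation in Drosophila", "organism": "fly", "type": "article"},
    {"id": 3, "title": "Transgenic mouse HD phenotypes", "organism": "mouse", "type": "article"},
    {"id": 4, "title": "Zebrafish screens for HD modifiers", "organism": "zebrafish", "type": "article"},
]


def facet_counts(docs, facet):
    """Count how many documents carry each value of one facet."""
    return Counter(doc[facet] for doc in docs)


def narrow(docs, selected):
    """Keep only documents that match every facet with ticked values.

    `selected` maps a facet name to the set of ticked checkbox values.
    Values inside one facet are OR-ed; different facets are AND-ed.
    Facets with no ticked values are ignored.
    """
    return [
        doc
        for doc in docs
        if all(doc[facet] in values for facet, values in selected.items() if values)
    ]


# Ticking "mouse" and "article" narrows four results down to one.
hits = narrow(DOCS, {"organism": {"mouse"}, "type": {"article"}})
print([doc["id"] for doc in hits])
print(facet_counts(DOCS, "organism").most_common())
```

Each ticked checkbox can only shrink the result list, which matches the preference for narrowing over expanding.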


Facets as search feature: likes & dislikes

Likes:
• Useful categories
• Simple
• Vertical list

Dislikes:
• Too complex/busy
• Too many colors
• Poor design
• Limited functionality
• Too many symbols
• Not special / Colorless


Our studies

2. Expert needs (media group)

Case Study: Media Group

They have a system/“taxonomy” in place that nobody maintains or uses…

~10,000 articles / week, ~5 million in their archives
~21 years, 10,000 authors
Handful of top categories

Main reasons/uses:
- Advertisement
- Packing up stories and selling them
- Readers finding stories & related stories
- Journalists finding related stories


Expert content needs - Case Study: Media Group

• Ideally update the taxonomy daily/weekly
• Must be dynamic & handle new cases/concepts
• Deep nesting is OK
• If there is multiple inheritance, need to disambiguate where a particular article belongs
• Be able to edit (be able to verify, in case of anomalies based on automation, & move nodes around)
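Two of these requirements, moving nodes around and disambiguating an article's place when a concept has several parents, can be sketched as follows. This is an illustrative toy rather than the media group's system or Pingar's software: the `Taxonomy` class, its methods and the example concepts are hypothetical, and the overlap-based choice in `best_parent` is an assumed heuristic.

```python
class Taxonomy:
    """Toy taxonomy in which a concept may have several parents."""

    def __init__(self):
        self.parents = {}  # concept -> set of parent concepts

    def add(self, concept, parent=None):
        """Register a concept, optionally attaching it under a parent."""
        self.parents.setdefault(concept, set())
        if parent is not None:
            self.parents.setdefault(parent, set())
            self.parents[concept].add(parent)

    def ancestors(self, concept):
        """Return every concept reachable by following parent links upward."""
        seen, stack = set(), list(self.parents.get(concept, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(self.parents.get(node, ()))
        return seen

    def move(self, concept, old_parent, new_parent):
        """Re-attach a node under another parent, refusing moves that create a cycle."""
        if new_parent == concept or concept in self.ancestors(new_parent):
            raise ValueError("move would create a cycle")
        self.parents[concept].discard(old_parent)
        self.parents[concept].add(new_parent)

    def best_parent(self, concept, article_concepts):
        """Assumed heuristic: pick the parent whose branch overlaps most with
        the concepts already attached to the article (ties broken by name)."""
        tagged = set(article_concepts)

        def overlap(parent):
            return len(({parent} | self.ancestors(parent)) & tagged)

        return max(sorted(self.parents[concept]), key=overlap)


tax = Taxonomy()
tax.add("Football", "Sport")
tax.add("Transfers", "Football")
tax.add("Transfers", "Business")  # multiple inheritance
print(tax.best_parent("Transfers", ["Business", "Finance"]))
```

An editor reviewing automated output could call `move` to correct a misplaced node; the cycle check keeps the hierarchy valid after manual edits.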


Our studies

3. Existing popular systems (EuroHCIR)

Matthew Pike, Max L. Wilson, Anna Divoli and Alyona Medelyan, CUES: Cognitive Usability Evaluation System, EuroHCIR 2012, Nijmegen, Netherlands

Exploring UI features - Systems Tested: Yippy, Carrot, MeSH, ESD


Exploring UI features (Yippy, Carrot, MeSH, ESD): likes & dislikes

Likes:
• Menu highlighting
• Hierarchical folder layout
• Expand hierarchy with “+” and “–”
• Dual view (tree on left, results on right)
• Ability to change visualisations of taxonomy
• Search function is important
• Familiar interface with folders

Dislikes:
• Too simple or too much writing - would be nice to have color
• Lots of scrolling
• Dots in Carrot circle are confusing
• Double click on foam tree is unintuitive
• Too broad taxonomies


Our studies

4. Mock ups of specific features (survey)

Taxonomy UI preferences (ongoing survey):

The (51) participants

Age:
• 25 or younger: 27.3%
• 26-40: 60.0%
• 41-60: 12.7%
• 61 or older: 0%

Highest level of education:
• High School: 3.6%
• College/University: 52.7%
• Graduate School: 43.6%

Do you have experience using taxonomies?
• Yes: 21.8%
• Yes, but very little: 47.3%
• No: 30.9%

How comfortable are you with computers?
• Somewhat: 5.5%
• Very: 47.3%
• Second nature: 47.3%

bit.ly/pingar_taxonomies

Concept sorting

• Popularity (A): 44.2%
• Alphabetically (B): 42.3%
• No preference: 13.5%

Displaying Counts

• A: 42.3%
• B: 51.9%
• No preference: 5.8%

Using Labels

• In frames (A): 72.5%
• With labels (B): 23.5%
• No preference: 3.9%

Plus/minus signs or arrows

• A: 47.1%
• B: 37.3%
• No preference: 15.7%

Search Results Display

• A: 13.7%
• B: 11.8%
• C: 70.6%
• No preference: 3.9%

Search Functionality

• Partial: 74.5%
• Hidden: 64.7%
• No preference: 2.0%

Where we stand

Our team works on automatically generated taxonomies, but we have realized that specific needs call for customization.


“Taxonomy is described sometimes as a science and sometimes as an art, but really it’s a battleground.”

Bill Bryson, A Short History of Nearly Everything


Taxonomy

Art

Science

Technology, Art, aXiomatic, philOsophy, desigN, lOgic, huManities, linguIstics, Ethnonology, Science


Summary

• There is a place for manually, socially and automatically generated taxonomies (as well as hybrids).

• Text is “big” and in many fields dynamic.
• “End-users” (not Information Management experts) need access to “big text”.
• Auto-generated taxonomies with manual editing facilities are now possible & make sense.
• Domain-specific background knowledge is vital for the quality and detail required per solution.
• User-friendly systems are very important for end users.


Acknowledgements

Alyona Medelyan (Pingar)
Max L. Wilson (Swansea/Nottingham)
Matthew Pike (Swansea/Pingar)

Pingar Brains

All 65+ anonymous study participants!

pingar.com