Post on 09-Aug-2020
How taxonomies and facets bring end-users closer to big data
Anna Divoli @annadivoli
Boston Oct 2012
Taxonomies
• τάξις/τάξη + νομία (arrangement/class + method/rule/law)
• hierarchical classification
• formal nomenclature
• varied dimensions
• evaluation/measures/metrics
• types: manually constructed, social, auto-generated
• purposes: auto-indexing, search facilitation, navigation, knowledge management, organization…
• it is OK to change the classification systems to adjust to new knowledge, not just adding new concepts
• the data have become “big” and available but not accessible
• many “end users”
User Studies Types
Specialized domain studies:
1. Facets (HCIR): Biomedical Scientists
2. Expert needs (media group)
UI preferred features studies:
3. Existing popular systems (EuroHCIR)
4. Mock ups of specific features (survey)
Anna Divoli and Alyona Medelyan, Search interface feature evaluation in biosciences, HCIR 2011, Google, Mountain View, CA
Matthew Pike, Max L. Wilson, Anna Divoli and Alyona Medelyan, CUES: Cognitive Usability Evaluation System, EuroHCIR 2012, Nijmegen, Netherlands
Our studies
1. Facets (HCIR): Biomedical Scientists
Anna Divoli and Alyona Medelyan, Search interface feature evaluation in biosciences, HCIR 2011, Google, Mountain View, CA
Facets – favorite feature for search systems
Facets (in search systems)
Example query: animal models huntington disease
Bio-Facets: most liked, least liked
Facets as search features for biomedical scientists: Findings
• Faceted search is the most important stand-alone feature in a search interface for bioscientists.
• Few, query-oriented facets presented as checkboxes work best.
• Overly simple aesthetics, although not desirable, do not hurt the overall UI score.
• Complex aesthetics turn users away from the systems.
• Bioscientists prefer tools that help them narrow their search, not expand it.
• For generic search: doc-based facets. For domain-specific search: query-based facets.
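To make the "few facets as checkboxes that narrow the search" finding concrete, here is a minimal sketch of faceted refinement. The documents, facet names (`organism`, `type`) and function names are hypothetical and are not taken from the systems evaluated in the study; it only illustrates how ticking facet values narrows a result set and how the counts shown next to each checkbox could be computed.

```python
from collections import Counter

# Hypothetical result set for illustration; each document carries facet values.
DOCS = [
    {"id": 1, "title": "Mouse models of Huntington disease", "organism": "mouse", "type": "review"},
    {"id": 2, "title": "Huntingtin aggregation in fly neurons", "organism": "fly", "type": "article"},
    {"id": 3, "title": "Transgenic mouse phenotypes", "organism": "mouse", "type": "article"},
]


def facet_counts(docs, facet):
    """Count how many documents carry each value of one facet."""
    return Counter(doc[facet] for doc in docs)


def refine(docs, selected):
    """Keep documents matching every ticked facet.

    `selected` maps a facet name to the set of ticked values:
    values within one facet are OR-ed, different facets are AND-ed.
    """
    return [
        doc for doc in docs
        if all(doc[facet] in values for facet, values in selected.items())
    ]


if __name__ == "__main__":
    print(facet_counts(DOCS, "organism"))
    narrowed = refine(DOCS, {"organism": {"mouse"}, "type": {"article"}})
    print([doc["id"] for doc in narrowed])
```

Each extra ticked facet can only shrink the result list, which matches the preference for tools that narrow rather than expand a search.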
Facets as search feature: likes & dislikes
Features: Autocomplete, Search expansions★, Faceted refinement, Related searches, Results preview★
+ positive comments; - negative comments; italics: comments on aesthetics
★ not many systems tested, so no low rank ratings
same ranking for both baseline & own query
- unhelpful symbols - too complex - too much info - hard to read
Semedico
- no diversity - too complex + highlighting - hard to read
NextBio
- too general + font color & size - unclear presentation - noisy, many symbols
GoPubMed
+ relevant suggestions + good coverage
- blue font color - small font size
PubMed
+ diff types of info + simple
+ highlighting
Bing
+ “review” suggestion +/- simple
+ highlighting + used to
- too general - not useful
+ overall look - color
Semedico
+ good functionality - specialized
- unclear functionality - too complex
PubMed
+ useful - redundancies
+ simple & clean + less options
Pingar
+ useful categories - slow functionality - too complex/busy - too many colors
Semedico
+ “reviews” category - limited functional. - poor design
PubMed
+ quick paper access + simple
+ vertical list - nothing special
Solr
+ “top terms” useful - too many symbols
- too busy - colors
GoPubMed
+ useful categories + simple
+ vertical list - not special, colorless
Pingar DB
+ useful categories + simple
+ vertical list - not special, colorless
Pingar QB
- not scientific + colors - too small - too busy
Bing
+ relevant - poor context - no variety
PubMed
- limited options + clickable + font size
+ few options
Pingar
+ good suggestions - redundancies
+ font color /blue links - too busy
+ specific keywords - snippets lack context
+color + font color
Solr
+ helpful keywords + mouseover
- pale - font size & style
Pingar
Legend: positive, neutral, negative (1 participant); scale from Ranked First to Ranked Last
Task types: br: browsing, ff: fact finding, ig: information gathering
• Useful categories
• Simple
• Vertical list

• Too complex/busy
• Too many colors
• Poor design
• Limited functionality
• Too many symbols
• Not special/colorless
Our studies
2. Expert needs (media group)
Case Study: Media Group
They have a system/”taxonomy” in place that nobody maintains or uses…
~10,000 articles / week, ~5 million in their archives
~21 years, 10,000 authors
Handful of top categories
Main reasons/uses:
- Advertisement
- Packing up stories and selling them
- Readers finding stories & related stories
- Journalists finding related stories
Expert content needs - Case Study: Media Group
→ Ideally update the taxonomy daily/weekly
→ Must be dynamic & handle new cases/concepts
→ Deep nesting is OK
→ If multiple inheritance, need to disambiguate where a particular article belongs
→ Be able to edit (be able to verify, in case of anomalies based on automation, & move nodes around)
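The needs above (deep nesting, multiple inheritance with disambiguation, moving nodes by hand) can be sketched with a small in-memory structure. This is only an illustration under stated assumptions: the `Taxonomy` class, its method names, the "first-listed parent is primary" rule and the example concepts are all hypothetical, not the media group's system or Pingar's implementation.

```python
class Taxonomy:
    """Tiny in-memory taxonomy in which a concept may have several parents."""

    def __init__(self):
        # concept -> list of parent concepts; the first entry is treated as primary
        self.parents = {}

    def add(self, concept, parent=None):
        """Add a concept, or attach an existing concept to one more parent."""
        links = self.parents.setdefault(concept, [])
        if parent is not None:
            self.parents.setdefault(parent, [])
            if parent not in links:
                links.append(parent)

    def move(self, concept, old_parent, new_parent):
        """Manual edit: re-hang a concept under a different parent."""
        links = self.parents[concept]
        position = links.index(old_parent)
        links.remove(old_parent)
        if new_parent not in links:
            links.insert(position, new_parent)
        self.parents.setdefault(new_parent, [])

    def primary_path(self, concept):
        """Disambiguate multiple inheritance by following the first-listed parent."""
        path = [concept]
        while self.parents[path[-1]]:
            nxt = self.parents[path[-1]][0]
            if nxt in path:  # guard against cycles introduced by manual edits
                break
            path.append(nxt)
        return list(reversed(path))


if __name__ == "__main__":
    t = Taxonomy()
    t.add("News")
    t.add("Sport", "News")
    t.add("Business", "News")
    t.add("Football", "Sport")
    t.add("Club finances", "Business")
    t.add("Club finances", "Football")   # second parent: multiple inheritance
    print(t.primary_path("Club finances"))
    t.move("Club finances", "Business", "Sport")
    print(t.primary_path("Club finances"))
```

Treating the first parent as primary is one simple disambiguation rule; an editor could just as well store an explicit "home" parent per article.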
Our studies
3. Existing popular systems (EuroHCIR)
Matthew Pike, Max L. Wilson, Anna Divoli and Alyona Medelyan, CUES: Cognitive Usability Evaluation System, EuroHCIR 2012, Nijmegen, Netherlands
Exploring UI features - Systems Tested: Yippy, Carrot, MeSH, ESD
Exploring UI features (Yippy, Carrot, MeSH, ESD): likes & dislikes
• Menu highlighting
• Hierarchical folder layout
• Expand hierarchy with “+” and “–”
• Dual view (tree on left, results on right)
• Ability to change visualisations of taxonomy
• Search function is important
• Familiar interface with folders

• Too simple or too much writing - would be nice to have color
• Lots of scrolling
• Dots in Carrot circle: confusing
• Double click on foam tree is unintuitive
• Too broad taxonomies
Our studies
4. Mock ups of specific features (survey)
Taxonomy UI preferences (ongoing survey): The (51) participants
Age:
27.3% 25 or younger
60.0% 26-40
12.7% 41-60
0% 61 or older
Highest level of education:
3.6% High School
52.7% College/University
43.6% Graduate School
Do you have experience using taxonomies?
21.8% Yes
47.3% Yes, but very little
30.9% No
How comfortable are you with computers?
5.5% Somewhat
47.3% Very
47.3% Second nature
bit.ly/pingar_taxonomies
Concept sorting
44.2% popularity (A)
42.3% alphabetically (B)
13.5% no preference
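Since participants were almost evenly split between the two orderings, a taxonomy UI could reasonably offer both. The sketch below shows the two sort orders side by side; the concept names and document counts are invented for illustration and are not survey data.

```python
# Hypothetical concepts with the number of documents tagged with each.
concepts = {"genetics": 42, "animal models": 17, "neurology": 42, "therapy": 5}


def by_popularity(counts):
    """Option A: most frequent concepts first, ties broken alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0].lower()))


def alphabetically(counts):
    """Option B: case-insensitive A to Z, regardless of frequency."""
    return sorted(counts.items(), key=lambda item: item[0].lower())


if __name__ == "__main__":
    for label, ordering in (("popularity", by_popularity), ("alphabetical", alphabetically)):
        print(label)
        for name, count in ordering(concepts):
            print(f"  {name} ({count})")
```

Printing the count next to each concept also covers the count-display question asked in the same survey.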
Displaying Counts
42.3% A
51.9% B
5.8% no preference
Using Labels
72.5% in frames (A)
23.5% with labels (B)
3.9% no preference
Plus/minus signs or arrows
47.1% A
37.3% B
15.7% no preference
Search Results Display
13.7% A
11.8% B
70.6% C
3.9% no preference
Search Functionality
74.5% partial 64.7% hidden 2.0% no preference
Where we stand
Our team works on automatically generated taxonomies, but we realized the need for customization for specific needs
“Taxonomy is described sometimes as a science and sometimes as an art, but really it’s a battleground.”
Bill Bryson, A Short History of Nearly Everything
Taxonomy
Art, Science
Technology
Art
aXiomatic
philOsophy
desigN
lOgic
huManities
linguIstics
Ethnology
Science
Summary
• There is a place for manually, socially and automatically generated taxonomies (as well as hybrids).
• Text is “big” and in many fields dynamic.
• “End-users” (not Information Management experts) need access to “big text”.
• Auto-generated taxonomies with manual editing facilities are now possible & make sense.
• Domain-specific background knowledge is vital for the quality and detail required per solution.
• User-friendly systems are very important for end users.
Acknowledgements
Alyona Medelyan (Pingar)
Max L. Wilson (Swansea/Nottingham)
Matthew Pike (Swansea/Pingar)
Pingar Brains
All 65+ anonymous studies participants!
pingar.com