Anna Divoli (Pingar Research) "How taxonomies and facets bring end-users closer to big data" TAW2012

  • Date posted: 20-Jan-2015

description

Talk presented at Text Analytics World 2012 in Boston. More info: http://www.textanalyticsworld.com/boston/2012/agenda/full-agenda#day2415-5-2

Transcript of Anna Divoli (Pingar Research) "How taxonomies and facets bring end-users closer to big data" TAW2012

1. How taxonomies and facets bring end-users closer to big data
    Anna Divoli, @annadivoli
    Boston, Oct 2012

2. Taxonomies
    • (arrangement/class + method/rule/law)
    • hierarchical classification
    • formal nomenclature
    • varied dimensions
    • evaluation/measures/metrics
    • types: manually constructed, social, auto-generated
    • purposes: auto-indexing, search facilitation, navigation, knowledge management, organization
    • it is OK to change the classification systems to adjust to new knowledge, not just add new concepts
    • the data have become big and available, but not accessible to many end users

3. User Studies Types
    Specialized domain studies:
    1. Facets (HCIR): Biomedical Scientists
       Anna Divoli and Alyona Medelyan, "Search interface feature evaluation in biosciences", HCIR 2011, Google, Mountain View, CA
    2. Expert needs (media group)
    UI preferred features studies:
    3. Existing popular systems (EuroHCIR)
       Matthew Pike, Max L. Wilson, Anna Divoli and Alyona Medelyan, "CUES: Cognitive Usability Evaluation System", EuroHCIR 2012, Nijmegen, Netherlands
    4. Mock ups of specific features (survey)

4. Our studies
    1. Facets (HCIR): Biomedical Scientists
       Anna Divoli and Alyona Medelyan, "Search interface feature evaluation in biosciences", HCIR 2011, Google, Mountain View, CA

5. Facets: favorite feature for search systems
    Anna Divoli and Alyona Medelyan, "Search interface feature evaluation in biosciences", HCIR 2011, Google, Mountain View, CA, USA

6. Facets (in search systems)
    animal models huntington disease

7. Bio-Facets
    Most liked / Least liked
    animal models huntington disease

8. Facets as search features for biomedical scientists: Findings
    • Faceted search is the most important stand-alone feature in a search interface for bioscientists.
    • Few, query-oriented facets presented as checkboxes work best.
    • Overly simple aesthetics, although not desirable, do not hurt the overall UI score.
    • Complex aesthetics turn users away from the systems.
    • Bioscientists prefer tools that help them narrow their search, not expand it.
    • For generic search: doc-based facets.
    • For domain-specific search: query-based facets.

9. Facets as search feature: likes & dislikes
    [Figure: user comments by search feature and system. Feature panels: Facetted refinement, Search expansions, Related searches, Results preview. Systems named: Semedico, PubMed, Solr, Go, Bing. Legend: + positive comments.]
    • Comments marked +: useful categories, quick paper access, reviews category, simple, vertical list, colors, relevant, variety
    • Comments marked -: slow functionality, too complex/busy, too many colors, limited functionality, poor design, nothing special, not scientific, too small, too busy, poor context, too many symbols, colorless

10. Our studies
    2. Expert needs (media group)

11. Case Study: Media Group
    • They have a system/taxonomy in place that nobody maintains or uses
    • ~10,000 articles / week, ~5 million in their archives
    • ~21 years, 10,000 authors
    • Handful of top categories
    • Main reasons/uses:
        - Advertisement
        - Packing up stories and selling them
        - Readers finding stories & related stories
        - Journalists finding related stories

12. Expert content needs - Case Study: Media Group
    • Ideally update the taxonomy daily/weekly
    • Must be dynamic & handle new cases/concepts
    • Deep nesting is OK
    • If multiple inheritance, need to disambiguate where a particular article belongs
    • Be able to edit (be able to verify, in case of anomalies based on automation, & move nodes around)

13. Our studies
    3. Existing popular systems (EuroHCIR)
       Matthew Pike, Max L. Wilson, Anna Divoli and Alyona Medelyan, "CUES: Cognitive Usability Evaluation System", EuroHCIR 2012, Nijmegen, Netherlands

14-18. Exploring UI features - Systems tested: Yippy, Carrot, MeSH, ESD
    [Figure on slide 18: items labeled A-F]

19. Exploring UI features (Yippy, Carrot, MeSH, ESD): likes & dislikes
    Likes:
    • Menu highlighting
    • Hierarchical folder layout
    • Expand hierarchy with + and -
    • Dual view (tree on left, results on right)
    • Ability to change visualisations of taxonomy
    • Search function is important
    • Familiar interface with folders
    Dislikes:
    • Too simple or too much writing - would be nice to have color
    • Lots of scrolling
    • Dots in Carrot circle confusing
    • Double click on foam tree is unintuitive
    • Too broad taxonomies

20. Our studies
    4. Mock ups of specific features (survey)

21. Taxonomy UI preferences (ongoing survey): The (51) participants
    Age:
        25 or younger            27.3%
        26-40                    60.0%
        41-60                    12.7%
        61 or older               0%
    How comfortable are you with computers?
        Somewhat                  5.5%
        Very                     47.3%
        Second nature            47.3%
    Highest level of education:
        High School               3.6%
        College/University       52.7%
        Graduate School          43.6%
    Do you have experience using taxonomies?
        No                       30.9%
        Yes, but very little     47.3%
        Yes                      21.8%
    bit.ly/pingar_taxonomies

22. Concept sorting
        by popularity (A)        44.2%
        alphabetically (B)       42.3%
        no preference            13.5%

23. Displaying Counts
        A                        42.3%
        B                        51.9%
        no preference             5.8%

24. Using Labels
        in frames (A)            72.5%
        with labels (B)          23.5%
        no preference             3.9%

25. Plus/minus signs or arrows
        A                        47.1%
        B                        37.3%
        no preference            15.7%

26. Search Results Display
        A                        13.7%
        B                        11.8%
        C                        70.6%
        no preference             3.9%

27. Search Functionality
        partial                  74.5%
        hidden                   64.7%
        no preference             2.0%

28. Where we stand
    Our team works on automatically generated taxonomies, but we realized the need for customization for specific needs.

29. Taxonomy
    "Taxonomy is described sometimes as a science and sometimes as an art, but really it's a battleground."
    Bill Bryson, A Short History of Nearly Everything

30. Technology
    Art
    aXiomatic
    philOsophy
    desigN
    lOgic
    huManities
    linguIstics
    Ethnonology
    Science

31. Summary
    • There is a place for manually, socially and automatically generated taxonomies (as well as hybrids).
    • Text is big and, in many fields, dynamic.
    • End-users (not Information Management experts) need access to big text.
    • Auto-generated taxonomies with manual editing facilities are now possible & make sense.
    • Domain-specific background knowledge is vital for the quality and detail required per solution.
    • User-friendly systems are very important for end users.

32. Acknowledgements
    • Alyona Medelyan (Pingar)
    • Max L. Wilson (Swansea/Nottingham)
    • Matthew Pike (Swansea/Pingar)
    • Pingar Brains
    • pingar.com
    • All 65+ anonymous study participants!