CSE 111 Green: Program Design I
Lecture 15: Modules, plotting, and more
Guest Lecturer: Prof. Joe Hummel
Robert H. Sloan (CS) & Rachel Poretsky (Bio), University of Illinois, Chicago
October 17, 2017
PYTHON STANDARD LIBRARY & BEYOND: MODULES
Extending Python
- Every modern programming language has a way to extend the basic functions of the language with new ones
- Python: importing a module
- module: Python file with new capabilities defined in it
- Once you import a module, it's as if you typed it in: you get all functions, objects, variables defined in it immediately
Python Standard Library
- Python always comes with a big set of modules
- List at https://docs.python.org/3/py-modindex.html
- Examples

  csv       Read/write csv files
  datetime  Basic date & time types
  math      Math stuff (e.g., sin(), cos(), sqrt())
  os        E.g., list files in your operating system
  random    Random number generation
  urllib    Open URLs, parse URLs
BTW, did we mention
- You will probably need csv for next Monday's lab
- And random for Project 2
- And will use other modules too
Using Modules
- Use import to make a module's functions available
- Style: Put all import statements at top of file
- After import module_name, access its functions (and variables, etc.) through module_name.function_name
- If module_name is long, can abbreviate in import with as:
  - import module_name as mn
  - mn.function_name
- There are a few long-named common modules where everybody does this
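A minimal sketch of both import forms, using two standard library modules. The alias dt is just an illustrative short name, not something the slides prescribe.

```python
import math             # plain import: names are reached through the module name
import datetime as dt   # abbreviated import: "dt" now stands for "datetime"

root = math.sqrt(16)                  # module_name.function_name
lecture_day = dt.date(2017, 10, 17)   # same idea, through the short alias

print(root)
print(lecture_day)
```

After the second import, only the name dt is defined in this file; typing datetime.date(...) would fail unless datetime were also imported under its full name.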
If you prefer to save typing
- (I mostly do not do this)
- To access function_name without having to type the module_name prefix, use:
  from module_name import function_name
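A small sketch of the from ... import form, using math.sqrt as the example function. Note that this form defines only the name sqrt here, not the name math.

```python
from math import sqrt   # bring just sqrt into this file's namespace

result = sqrt(25)       # no "math." prefix needed
print(result)
```

The trade-off is readability: a reader who sees sqrt(25) far from the import line can no longer tell at a glance which module it came from.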
Dot notation remark
- Python makes two different uses of dot notation
  - methods, where as we've seen, we call the method as
    - obj_name.method_name()
  - functions in modules
    - module_name.function_name
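The two uses of the dot look the same on the page, so here is a short side-by-side sketch. The variable names are made up for illustration.

```python
import math

word = "plotting"
shout = word.upper()    # method: obj_name.method_name(), word is a str object
root = math.sqrt(9)     # function in a module: module_name.function_name()

print(shout)
print(root)
```

In the first call the thing before the dot is a value (a string); in the second it is a module that was imported.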
Common but not standard
- matplotlib and pandas: 2 (of many) examples of modules that are not among the modules required to come with every Python 3
  - matplotlib: Drawing graphs (in style of Matlab)
  - We will do some work with it in our next lab
  - pandas: Data science stuff; won't use it in this course
- matplotlib very, very widely used, and pandas widely used
- Both among the many modules that come with the Anaconda distribution of Python 3
We all can make modules for ourselves
- Modules are used to group functions
  - True of standard library modules, of matplotlib, and of modules we write ourselves
  - Very useful for clarity and reuse as overall project sizes get larger
  - Not much need for your own modules in CS 111
- Any file ending in .py can act as a module
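To make "any .py file can act as a module" concrete, here is a self-contained sketch. The module name cs111_shapes and its function rectangle_area are hypothetical. The sketch writes the file into a scratch directory only so that it can run on its own; in a real project you would simply save the file next to your main program.

```python
import os
import sys
import tempfile

# Hypothetical module contents: a file holding one function.
module_source = '''
def rectangle_area(width, height):
    """Return the area of a width-by-height rectangle."""
    return width * height
'''

# Save the text as cs111_shapes.py in a fresh scratch directory.
module_dir = tempfile.mkdtemp()
fileref = open(os.path.join(module_dir, "cs111_shapes.py"), "w")
fileref.write(module_source)
fileref.close()

sys.path.insert(0, module_dir)   # tell Python where to look for the file
import cs111_shapes              # the .py file now acts like any other module

area = cs111_shapes.rectangle_area(3, 4)
print(area)
```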
CSV FILES AND A BIT MORE ON FILES GENERALLY
Files and real-world data
- open() has an optional encoding argument, specifying a character encoding
  - In Python 3 pass it by keyword, e.g. open(filename, "r", encoding="utf-8"); the third positional argument of open() is buffering, not the encoding
- Irrelevant most of the time
- But you may need it if you are working with Arabic, Greek, Hebrew, Mandarin, Russian, etc.
- Or purely English materials using oddball symbols
- (About 90% of files have the character encoding that is assumed when no encoding argument is given, but you could get one of the other 10%. Should not come up in CS 111 Green this semester.)
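A minimal sketch of giving the encoding explicitly, assuming the file is UTF-8. The file name greeting.txt and its Greek contents are made up, and the file is created in a scratch directory so the example runs by itself.

```python
import os
import tempfile

greeting = "Καλημέρα"   # Greek for "good morning"; illustrative data only
path = os.path.join(tempfile.mkdtemp(), "greeting.txt")

# Write the text, stating the encoding by keyword.
fileref = open(path, "w", encoding="utf-8")
fileref.write(greeting + "\n")
fileref.close()

# Read it back with the same encoding.
fileref = open(path, "r", encoding="utf-8")
text = fileref.read()
fileref.close()

print(text == greeting + "\n")   # the text survives the round trip
```

If the encoding used for reading does not match the one the file was written with, Python may raise a UnicodeDecodeError or silently produce the wrong characters.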
Files and real-world data: CSV
- Structured text! In 2017, we often want to communicate between all sorts of different electronic tools
- CSV (comma-separated values) is a format used by Excel, and very common for exchanging large collections of data
- Python has a csv module, and it has csv.writer() and csv.reader() functions that could help you
  - Lab next week will have an ecology data science flavor, and probably it will have csv input
CSV data: LHS is (real) Excel spreadsheet

Fall Semester    UG Majors  1 Yr % Inc.
2006             215
2007             242        12.6%
2008             252        4.1%
2009             286        13.5%
2010             318        11.2%
2011             328        3.1%
2012             385        17.4%
2013             493        28.1%
2014             594        20.5%
2015             701        18.0%
2016             843        20.3%
2017 est.        970        15.1%
2017 rev. est.   1063       26.1%

- And Excel can save it as CSV:

Fall Semester,UG Majors,1 Yr % Inc.
2006,215,
2007,242,12.6%
2008,252,4.1%
2009,286,13.5%
2010,318,11.2%
2011,328,3.1%
2012,385,17.4%
2013,493,28.1%
2014,594,20.5%
2015,701,18.0%
2016,843,20.3%
2017 est.,970,15.1%
2017 rev. est. ,1063,26.1%
For the record: CSV format
- Format is a comma separating each value in a row; newline to end rows
  - And can specify to use something else instead of comma
- But 2017 data science work mostly goes on just knowing that it is a format that lots of software knows how to handle, and we don't have to know the details
  - Assuming we use the csv module
  - Would have to know the details if we just used open() followed by readlines()
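One detail the csv module handles for us is quoting: a value that itself contains a comma is wrapped in double quotes. The sketch below, with made-up rows, compares splitting by hand with csv.reader, and also shows the delimiter option for a separator other than a comma.

```python
import csv

line = '"Sloan, Robert",CS'   # one CSV row; the first value contains a comma

naive = line.split(",")             # splitting by hand breaks the quoted value apart
parsed = next(csv.reader([line]))   # csv.reader follows the quoting rules

print(len(naive), naive)
print(len(parsed), parsed)

# Something else instead of a comma: here, tab-separated values.
tab_row = next(csv.reader(["2016\t843"], delimiter="\t"))
print(tab_row)
```

csv.reader accepts any iterable of text lines, which is why a one-element list works here just as an open file would.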
Reading in csv data (coming to a lab or project near you)
- import csv
- Open file as usual:
  - fileref = open("filename.csv", "r")
- Then create csv reader object from the fileref:
  - data_reader = csv.reader(fileref)
- Use for loop to iterate over that, each row a list of strings
  - for row in data_reader:
  -     # row is list of strings, 1 per entry in row
  -     # process list with for or with indexing
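Putting those steps together, here is a runnable sketch. It first creates a small file (the first rows of the enrollment data shown earlier) so that there is something to read; the file name majors.csv is made up. Passing newline="" when opening is what the csv documentation recommends, though for simple files like this one it makes no visible difference.

```python
import csv
import os
import tempfile

# Create a small CSV file to read.
path = os.path.join(tempfile.mkdtemp(), "majors.csv")
fileref = open(path, "w")
fileref.write("Fall Semester,UG Majors,1 Yr % Inc.\n2006,215,\n2007,242,12.6%\n")
fileref.close()

fileref = open(path, "r", newline="")
data_reader = csv.reader(fileref)
header = next(data_reader)       # first row holds the column names
majors = []
for row in data_reader:          # each row is a list of strings
    majors.append(int(row[1]))   # convert the count from str to int
fileref.close()

print(header)
print(majors)
```

Every entry arrives as a string, even ones that look like numbers, so the int() conversion is needed before doing arithmetic.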
Similarly, csv writer objects
- If you need to write a csv file, there is an analogous csv.writer function
- and a csv writer object like
  - wr = csv.writer(fileref)
  - that has methods writerow() and writerows()
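A short sketch of the writer side, assuming a made-up output file out.csv in a scratch directory. writerow() takes one row (a list); writerows() takes a list of rows.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.csv")

fileref = open(path, "w", newline="")   # newline="" is recommended for csv files
wr = csv.writer(fileref)
wr.writerow(["Fall Semester", "UG Majors"])   # one row
wr.writerows([[2015, 701], [2016, 843]])      # several rows at once
fileref.close()

# Read the raw text back to see what was written.
fileref = open(path, "r")
contents = fileref.read()
fileref.close()
print(contents)
```

The writer converts numbers to text and inserts the commas, so the rows can hold ints directly.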
Programming: A Superpower
- Why write Python programs and not just use Excel?
  1. We can write a program that computes anything, not just what is built into Excel (all this biology is just one example!)
  2. Excel is not built for big data; Python is
  - Chicago crime data Prof. Sloan used in some security and privacy research has 1,048,576 rows (about a million), 18 columns
  - Python: creating csv.reader and looping over all the rows to count them: 1 second (Sloan's 2013 laptop)
  - Open file in Excel: several minutes
    - Just resize one column for better viewing: 5-30 sec
Before leaving files: with as
- "Oh, sweetie, you left the refrigerator door open again!"
  - Snarl!
- It is really bad practice to open files without closing them
  - Messing with the computer's file system (more in CS 361)
  - Typically nothing bad happens with the size of programs we write in CS 111, but it could, and it's still a bad practice
- Better Python construct than open/close: with as, guaranteeing we never forget to close what we've opened
  - Opens and closes; get file only inside block
with as syntax example
    with open("proteins.aa", "r") as fileref:
        # use fileref here
        for line in fileref:   # could have been fileref.read() etc.
            pass               # process each line here
    # No close needed
with as considered better style than open close
- Because it makes it impossible to forget the close
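To see the automatic close happen, this sketch checks the file object's closed attribute inside and after the block. The contents written to proteins.aa are made up, and the file lives in a scratch directory so the example is self-contained.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "proteins.aa")

with open(path, "w") as fileref:   # made-up contents, just to have a file
    fileref.write("MKV\nLAT\n")

with open(path, "r") as fileref:
    lines = fileref.readlines()
    open_inside = not fileref.closed   # the file is still open inside the block

closed_after = fileref.closed          # closed automatically on leaving the block

print(open_inside, closed_after)
print(lines)
```

The close happens even if an error occurs partway through the block, which a hand-written fileref.close() at the end would not guarantee.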
Another useful & interesting module: random
(Attention: Not in book, in upcoming lab/project)

    >>> import random
    >>> for i in range(5):
    ...     print(random.random())
    ...
    0.12636664029165268
    0.2821272889535512
    0.6160031940187543
    0.28609006981908525
    0.6277074518401735

- Notice: We're using the function named random from module named random, hence random.random()
Commonly used random functions
- random.random()  # takes no input
  - returns pseudorandom float x with 0.0 <= x < 1.0
- random.uniform(a, b)
  - returns float pseudorandomly chosen from between a and b
- random.choice(ls)  # gets list as input
  - returns pseudorandomly chosen element of the list
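A quick sketch calling all three functions. The call to random.seed() is an addition that the slides do not use; it only makes repeated runs produce the same values, which helps when debugging. The seed value and the arguments are arbitrary.

```python
import random

random.seed(111)   # assumption: fixed seed so reruns give the same values

x = random.random()             # float with 0.0 <= x < 1.0
y = random.uniform(2.5, 7.5)    # float between 2.5 and 7.5
z = random.choice(["A", "a"])   # one element of the list

print(x)
print(y)
print(z)
```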
Randomly choosing words from a list
    >>> for i in range(5):
    ...     print(random.choice(["Here", "is", "a", "list", "of", "words", "in", "random", "order"]))
    ...
    list
    words
    in
    Here
    list
Randomly generating language
- Given a list of nouns, verbs that agree in tense and number, and object phrases that all match the verb,
- We can randomly take one from each to make sentences.
    import random

    def excuse():
        excuse = ["I didn't know I was in this class",
                  "I thought I already graduated",
                  "I got stuck in a blizzard"]
        bigNum = ["4", "17", "like a billion", "mega", "tons of"]
        lottaWork = ["midterms", "Ph.D. theses", "programs"]
        print("I need an extension because", random.choice(excuse), "and I had",
              random.choice(bigNum), random.choice(lottaWork), "to do.")

Side note: Good example of a function that should have 0 inputs and no return value.
Running random sentence generator
    >>> excuse()
    I need an extension because I thought I already graduated and I had like a billion programs to do.
    >>> excuse()
    I need an extension because I got stuck in a blizzard and I had 4 programs to do.
    >>> excuse()
    I need an extension because I got stuck in a blizzard and I had 17 programs to do.
    >>> excuse()
    I need an extension because I thought I already graduated and I had tons of programs to do.
    >>> excuse()
    I need an extension because I didn't know I was in this class and I had 17 Ph.D. theses to do.
Choosing randomly from a population
We can sample using the random module's choice here too (will be part of Proj 2)

    >>> import random
    >>> pop_list = ["A", "A", "a", "A", "a"]
    >>> random.choice(pop_list)
    'A'
    >>> pop_list    # Didn't change the original list
    ['A', 'A', 'a', 'A', 'a']
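As a sketch of where this leads, drawing many times from the same population shows the proportions in the list coming through in the sample. The number of draws (1000) and the variable names are arbitrary choices for illustration, not taken from the project.

```python
import random

pop_list = ["A", "A", "a", "A", "a"]

draws = []
for i in range(1000):
    draws.append(random.choice(pop_list))   # sampling leaves pop_list unchanged

count_big = draws.count("A")
count_small = draws.count("a")

print(count_big, count_small)   # typically near 600 and 400, varying from run to run
```

Because "A" makes up three of the five entries, each draw picks it with probability 3/5.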
MATPLOTLIB MODULE Drawing graphs
A Picture is Worth 1000 Excel cells

Year,Annual anomaly,Lower 95% confidence interval,Upper 95% confidence interval
1880,-0.4700088,-0.672646261,-0.267371339
1881,-0.3568788,-0.560588343,-0.153169257
1882,-0.3726612,-0.575728173,-0.169594227
1883,-0.448443,-0.650803864,-0.246082136
1884,-0.5897538,-0.790478088,-0.389029512
1885,-0.6636546,-0.