CMSC 330: Organization of Programming Languages Type Systems Implementation Topics

download CMSC 330: Organization of Programming Languages Type Systems Implementation Topics

of 59

  • date post

  • Category


  • view

  • download


Embed Size (px)

Transcript of CMSC 330: Organization of Programming Languages Type Systems Implementation Topics

PowerPoint Presentation

CMSC 330: Organization of Programming LanguagesType SystemsImplementation Topics1CMSC 3302Review: Lambda CalculusA lambda calculus expression is defined as

e ::= xvariable | x.efunction | e efunction application

x.e is like (fun x -> e) in OCaml2CMSC 3303Review: Beta-ReductionWhenever we do a step of beta reduction...(x.e1) e2 e1[x/e2]...alpha-convert variables as necessary

Examples:(x.x (x.x)) z = (x.x (y.y)) z z (y.y)(x.y.x y) y = (x.z.x z) y z.y z3CMSC 3304The Need for TypesConsider untyped lambda calculusfalse = x.y.y0 = x.y.ySince everything is encoded as a function...We can easily misuse termsfalse 0 y.yif 0 then ...Everything evaluates to some functionThe same thing happens in assembly languageEverything is a machine word (a bunch of bits)All operations take machine words to machine words4CMSC 3305What is a Type System?A type system is some mechanism for distinguishing good programs from badGood = well typedBad = ill typed or not typable; has a type error

Examples0 + 1// well typedfalse 0// ill-typed; cant apply a boolean5CMSC 3306Static versus Dynamic TypingIn a static type system, we guarantee at compile time that all program executions will be free of type errorsOCaml and C have static type systems

In a dynamic type system, we wait until runtime, and halt a program (or raise an exception) if we detect a type errorRuby has a dynamic type system6CMSC 3307Simply-Typed Lambda Calculuse ::= n | x | x:t.e | e eWeve added integers n as primitivesWithout at least two distinct types (integer and function), cant have any type errorsFunctions now include the type of their argumentt ::= int | t tint is the type of integerst1 t2 is the type of a function that takes arguments of type t1 and returns a result of type t2t1 is the domain and t2 is the rangeNotice this is a recursive definition, so that we can give types to higher-order functions7CMSC 3308Type JudgmentsWe will construct a type system that proves judgments of the form

A e : t

In type environment A, expression e has type t

If for a program e we can prove that it has some type, then the program type checksOtherwise the program has a type error, and well reject the program as bad8CMSC 3309Type EnvironmentsA type environment is a map from variables names to their types

is the empty type environment

A, x:t is just like A, except x now has type t

When we see a variable in the program, well look up its type in the environment9CMSC 33010Type RulesA n : inte ::= n | x | x:t.e | e eA x : tx:t AA x:t.e : t t'A, x : t e : t'A e e' : t'A e : t t'A e' : tTIntTFunTVarTApp10CMSC 33011ExampleA (x:int.+ x 3) 4 : intA = { + : int int int }A (x:int.+ x 3) : int intA 4 : intB + x 3 : intB 3 : intB + : iii

B + x : int intB x : intB = A, x : int11CMSC 33012DiscussionThe type rules provide a way to reason about programs (ie. a formal logic)The tree of judgments we just saw is a kind of proof in this logic that the program has a valid type

So the type checking problem is like solving a jigsaw puzzleCan we apply the rules to a program in such a way as to produce a typing proof?We can do this automatically12CMSC 33013An Algorithm for Type Checking(Write this in OCaml!)TypeCheck : type env expression type

TypeCheck(A, n) = intTypeCheck(A, x) = if x in A then A(x) else failTypeCheck(A, x:t.e) = let t' = TypeCheck((A, x:t), e) in t t'TypeCheck(A, e1 e2) =let t1 = TypeCheck(A, e1) inlet t2 = TypeCheck(A, e2) in if dom(t1) = t2 then range(t1) else fail13CMSC 33014Type InferenceWe could extend the rules to allow the type checker to deduce the types of every expression in a program even without the annotationsThis is called type inferenceNot covered in this class14CMSC 33015PracticeReduce the following:(x.y.x y y) (a.a) b(or true) (and true false)(* 1 2) (* m n = M.N.x.(M (N x)) )

Derive and prove the type of:A (f:int->int.n:int.f n) (x:int. + 3 x) 6

x:int->int->int. y:int->int. z:int.x z (y z)A = { + : int int int }15Review of CMSC 330SyntaxRegular expressionsFinite automataContext-free grammarsSemanticsOperational semanticsLambda calculusImplementationNames and scopeEvaluation orderConcurrencyGenericsExceptionsGarbage collectionCMSC 3301616Implementation:Names and ScopeCMSC 330CMSC 33018Names and BindingPrograms use names to refer to thingsE.g., in x = x + 1, x refers to a variable

A binding is an association between a name and what it refers toint x; /* x is bound to a stack location containing an int */int f (int) { ... } /* f is bound to a function */class C { ... } /* C is bound to a class */let x = e1 in e2 (* x is bound to e1 *)

18CMSC 33019Name RestrictionsLanguages often have various restrictions on names to make lexing and parsing easierNames cannot be the same as keywords in the languageSometimes case is restrictedNames generally cannot include special characters like ; , : etcUsually names are upper- and lowercase letters, digits, and _ (where the first character cant be a digit)19CMSC 33020Names and ScopesGood names are a precious commodityThey help document your codeThey make it easy to remember what names correspond to what entities

We want to be able to reuse names in different, non-overlapping regions of the code20CMSC 33021Names and Scopes (contd)A scope is the region of a program where a binding is activeThe same name in a different scope can have a different binding

A name is in scope if it's bound to something within the particular scope were referring to

Two names bound to the same object are aliases21CMSC 33022Examplei is in scopein the body of w, the body of y, and after the declaration of j in zbut all those is are different

j is in scopein the body of x and zvoid w(int i) { ...}

void x(float j) { ...}

void y(float i) { ...}

void z(void) { int j; char *i; ...}22CMSC 33023Ordering of BindingsLanguages make various choices for when declarations of things are in scopeGenerally, all declarations are in scope from the declaration onwardWhat about function calls? int x = 0; int f() { return x; } int g() { int x = 1; return f(); }What is the result of calling g()?23CMSC 33024Static ScopeIn static scoping, a name refers to its closest binding, going from inner to outer scope in the program textLanguages like C, C++, Java, Ruby, and OCaml are statically scopedint i;

{ int j;

{ float i;

j = (int) i; }}24CMSC 33025Dynamic ScopeIn a language with dynamic scoping, a name refers to its closest binding at runtime25CMSC 33026Ordering of BindingsBack to the example:

int x = 0; int f() { return x; } int g() { int x = 1; return f(); }

What is the result of calling g() ...... with static scoping?... with dynamic scoping?26CMSC 33027Static vs. Dynamic ScopeStatic scopingLocal understanding of function behavior

Know at compile-time what each name refers to

A bit trickier to implementDynamic scopingCan be hard to understand behavior of functionsRequires finding name bindings at runtime

Easier to implement (just keep a global table of stacks of variable/value bindings )27CMSC 33028NamespacesLanguages have a top-level or outermost scopeMany things go in this scope; hard to control collisionsCommon solution: add a hierarchyOCaml: ModulesList.hd, String.length, to add names into current scopeJava: Packagesjava.lang.String, java.awt.Point, etc.import to add names into current scopeC++: Namespacesnamespace f { class g { ... } }, f::g b, etc.using namespace to add names to current scope28CMSC 33029Free and Bound VariablesThe bound variables of a scope are those names that are declared in itIf a variable is not bound in a scope, it is freeThe bindings of variables which are free in a scope are "inherited" from declarations of those variables in outer scopes in static scoping{ /* 1 */ int j;

{ /* 2 */ float i;

j = (int) i; }}i is bound inscope 2j is free inscope 2j is bound inscope 129Implementation:Evaluation OrderCMSC 330CMSC 33031Call-by-Value (CBV)In call-by-value (eager evaluation), arguments to functions are fully evaluated before the function is invokedThis is the standard evaluation order that we're used to from C, C++, and JavaAlso in OCaml, in let x = e1 in e2, the expression e1 is fully evaluated before e2 is evaluated31CMSC 33032Call-by-Value in Imperative LanguagesIn C, C++, and Java, call-by-value has another featureWhat does this program print?

Prints 0void f(int x) { x = 3;}

int main() { int x = 0; f(x); printf("%d\n", x);}32CMSC 33033Call-by-Value in Imperative Languages, con't.Actual parameter is copied to stack location of formal parameter

Modification of formal parameter not reflected in actual parameter!int main() { int x = 0; f(x); printf("%d\n", x);}x0void f(int x) { x = 3;}x0333CMSC 33034Call-by-Reference (CBR)Alternative idea: Implicitly pass a pointer or reference to the actual parameterIf the function writes to it the actual parameter is modifiedint main() { int x = 0; f(x); printf("%d\n", x);}void f(int x) { x = 3;}x0x334CMSC 33035Call-by-Reference (contd)AdvantagesThe entire argument doesn't have to be copied to the called functionIt's more efficient if youre passing a large (multi-word) argumentAllows easy multiple return valuesDisadvantagesCan you pass a non-variable (e.g., constant, function result) by reference?It may be hard to tell if a function modifies an argumentWhat if you have aliasing?35CMSC 33036AliasingWe say that two names are aliased if they refer to the same object in memoryC examples (this is what makes optimizing C hard)int x;int *p, *q; /*Note that C uses pointers to simulate call by reference */p = &x; /* *p and x are aliased */q = p; /