Friday, October 30, 2009

Groupby - collection processing

Iterator and Iterable provide most of the methods that are useful when working with collections. fold, map and filter are probably the most common, but other very useful methods include grouped/groupBy, sliding, find, forall, foreach and many more. In this topic I want to cover Iterable's groupBy method.

This method is available in Scala 2.8 and later. It is similar to partition in that it allows the collection to be divided (or partitioned). partition takes a function which returns a Boolean and splits the collection in two depending on the result. groupBy takes a function that returns an object and returns a Map whose keys are the values returned by that function. This allows an arbitrary number of partitions to be made from the collection.
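To make the contrast concrete, here is a minimal sketch that assumes Scala 2.8 or later. The object name GroupByVsPartition and the sample numbers are made up for illustration.

```scala
object GroupByVsPartition {
  val numbers = (1 to 10).toList

  // partition: a Boolean predicate always yields exactly two collections
  val (small, large) = numbers.partition(_ < 5)

  // groupBy: every distinct return value of the function becomes a key,
  // so the number of groups is not fixed in advance
  val byRemainder = numbers.groupBy(_ % 3)

  def main(args: Array[String]): Unit = {
    println(small)          // List(1, 2, 3, 4)
    println(large)          // List(5, 6, 7, 8, 9, 10)
    println(byRemainder(0)) // List(3, 6, 9)
  }
}
```

Looking up a key that no element produced throws a NoSuchElementException, so byRemainder.get(key) is the safer call when the key may be missing.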

Here is the method signature:
  def groupBy[K](f : (A) => K) : Map[K, This]

A bit of context is required to understand the three type parameters A, K and This. The method is defined in a supertype of the collections called TraversableLike (I will briefly discuss this in the next topic). TraversableLike takes two type parameters: the type of the collection and the type contained in the collection. Therefore, in this method definition, 'This' refers to the collection type (List, for example) and A refers to the contained type (perhaps Int). Finally, K refers to the type returned by the function; its values are the keys of the groups formed by the method.
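The roles of A and K can be seen in a hand-written version specialised to List, so that 'This' is fixed to List[A]. This is only a sketch: groupByList is a hypothetical helper, not the library's implementation.

```scala
object GroupBySketch {
  // A is the element type, K is the key type returned by f.
  // The collection type ('This' in the real signature) is fixed to List[A] here.
  def groupByList[A, K](xs: List[A])(f: A => K): Map[K, List[A]] =
    xs.foldRight(Map.empty[K, List[A]]) { (x, groups) =>
      val key = f(x)
      // prepend while folding from the right so each group keeps the original order
      groups.updated(key, x :: groups.getOrElse(key, Nil))
    }

  def main(args: Array[String]): Unit = {
    println(groupByList(List(1, 2, 3, 4, 5, 6))(_ % 2))
  }
}
```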

Examples:
  scala> val groups = (1 to 20).toList groupBy {
       | case i if(i<5) => "g1"
       | case i if(i<10) => "g2"
       | case i if(i<15) => "g3"
       | case _ => "g4"
       | }
  groups: scala.collection.Map[java.lang.String,List[Int]] = Map(g1 -> List(1, 2, 3, 4), g2 -> List(5, 6, 7, 8, 9), g3 -> List(10, 11, 12, 13, 14), g4 -> List(15, 16, 17, 18, 19, 20))
  scala> groups.keySet
  res6: scala.collection.Set[java.lang.String] = Set(g1, g2, g3, g4)
  scala> groups("g1")
  res7: List[Int] = List(1, 2, 3, 4)
  scala> val mods = (1 to 20).toList groupBy ( _ % 4 )
  mods: scala.collection.Map[Int,List[Int]] = Map(1 -> List(1, 5, 9, 13, 17), 2 -> List(2, 6, 10, 14, 18), 3 -> List(3, 7, 11, 15, 19), 0 -> List(4, 8, 12, 16, 20))
  scala> mods.keySet
  res9: scala.collection.Set[Int] = Set(1, 2, 3, 0)
  scala> mods(1)
  res11: List[Int] = List(1, 5, 9, 13, 17)

Thursday, October 29, 2009

Boolean Extractors

As discussed in other topics, there are several ways to create custom extractors for objects.
There is one more kind of custom extractor that can be defined: the simple boolean extractor. Boolean extractors are used as follows:
  scala> "hello world" match {
       | case HasVowels() => println("Vowel found")
       | case _ => println("No Vowels")
       | }
  Vowel found

A boolean extractor is an object that returns Boolean from the unapply method rather than Option[_].

Examples:
  scala> object HasVowels { def unapply(in:String):Boolean = in.exists( "aeiou" contains _ ) }
  defined module HasVowels
  // Note that HasVowels() has ().
  // This is critical because otherwise the match checks whether
  // the input is the HasVowels object.
  // The () forces the unapply method to be used for matching
  scala> "hello world" match {
       | case HasVowels() => println("Vowel found")
       | case _ => println("No Vowels")
       | }
  Vowel found
  // Don't forget the ()
  scala> "kkkkkk" match {
       | case HasVowels() => println("Vowel found")
       | case _ => println("No Vowels")
       | }
  No Vowels
  scala> class HasChar(c:Char) {
       | def unapply(in:String) = in.contains(c)
       | }
  defined class HasChar
  scala> val HasC = new HasChar('c')
  HasC: HasChar = HasChar@3f2f529b
  // Don't forget the () it is required here as well
  scala> "It actually works!" match {
       | case HasC() => println("a c was found")
       | case _ => println("no c found")
       | }
  a c was found
  // Don't forget the ()
  scala> "hello world" match {
       | case HasC() => println("a c was found")
       | case _ => println("no c found")
       | }
  no c found
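Because a boolean extractor binds no values, the matched input is only available if it is bound explicitly. The sketch below combines the extractor with the @ binder; the object name BooleanExtractorSketch and the describe method are made up for illustration.

```scala
object BooleanExtractorSketch {
  // Same extractor as in the session above: unapply returns Boolean, not Option
  object HasVowels {
    def unapply(in: String): Boolean = in.exists("aeiou".contains(_))
  }

  def describe(s: String): String = s match {
    // word @ ... binds the whole input when the extractor returns true
    case word @ HasVowels() => "vowel found in " + word
    case _                  => "no vowels"
  }

  def main(args: Array[String]): Unit = {
    println(describe("hello world")) // vowel found in hello world
    println(describe("kkkkkk"))      // no vowels
  }
}
```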

Wednesday, October 28, 2009

Extractors 3 (Operator Style Matching)

One of the blessings and curses of Scala is its several rules for creating expressive code (a curse because they can be used for evil). One such rule relates to extractors and allows the following style of match pattern:
  List(1,2,3) match {
    case 1 :: _ => println("found a list that starts with a 1")
    case _ => println("boo")
  }

The rule is very simple. An extractor object that returns Option[Tuple2[_,_]] (or equivalently Option[(_,_)]) can be expressed in this form.

In other words: object X { def unapply(in:String):Option[(String,String)] = ... } can be used in a case statement like: case first X head => ... or case "a" X head => ....

Example to extract out the vowels from a string:
  scala> object X { def unapply(in:String):Option[(RichString,RichString)] = Some(in.partition( "aeiou" contains _ )) }
  defined module X
  scala> "hello world" match { case head X tail => println(head, tail) }
  (eoo,hll wrld)
  // This is equivalent but a different way of expressing it
  scala> "hello world" match { case X(head, tail) => println(head, tail) }
  (eoo,hll wrld)
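The session above uses the Scala 2.7 RichString type. As a hedged sketch for Scala 2.8 and later, where partition on a String returns a pair of plain Strings, the same extractor could be written as follows (the object name VowelSplit and the split method are made up for illustration):

```scala
object VowelSplit {
  // Returns Option[(String, String)], so X can be used in infix (operator) position
  object X {
    def unapply(in: String): Option[(String, String)] =
      Some(in.partition("aeiou".contains(_)))
  }

  def split(s: String): (String, String) = s match {
    // equivalent to: case X(vowels, rest) => ...
    case vowels X rest => (vowels, rest)
  }

  def main(args: Array[String]): Unit = {
    println(split("hello world")) // (eoo,hll wrld)
  }
}
```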


Example for Matching the last element in a list. Thanks to 3-things-you-didnt-know-scala-pattern.html:
  scala> object ::> { def unapply[A] (l: List[A]) = Some( (l.init, l.last) ) }
  defined module $colon$colon$greater
  scala> List(1, 2, 3) match {
       | case _ ::> last => println(last)
       | }
  3
  scala> (1 to 9).toList match {
       | case List(1, 2, 3, 4, 5, 6, 7, 8) ::> 9 => "woah!"
       | }
  res12: java.lang.String = woah!
  scala> (1 to 9).toList match {
       | case List(1, 2, 3, 4, 5, 6, 7) ::> 8 ::> 9 => "w00t!"
       | }
  res13: java.lang.String = w00t!
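One caveat with the ::> extractor above: init and last both throw on an empty list, so matching Nil against it fails with an exception rather than falling through to the next case. A possible variation, offered only as a sketch, returns None for the empty list (the names LastElementExtractor and lastOrDefault are made up for illustration):

```scala
object LastElementExtractor {
  object ::> {
    // None for an empty list lets the match fall through to the next case
    def unapply[A](l: List[A]): Option[(List[A], A)] =
      if (l.isEmpty) None else Some((l.init, l.last))
  }

  def lastOrDefault(xs: List[Int], default: Int): Int = xs match {
    case _ ::> last => last
    case _          => default
  }

  def main(args: Array[String]): Unit = {
    println(lastOrDefault(List(1, 2, 3), 0)) // 3
    println(lastOrDefault(Nil, 0))           // 0
  }
}
```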

Thursday, October 1, 2009

Vacation

For the next three weeks (Oct 3, 2009 to Oct 25) I am going on vacation, which means my Internet access will not be regular. I will post when I can, but I cannot guarantee any real schedule. To keep you interested, I recommend looking at this very nice overview of Scala's features:

Another Scala Tour

I really liked the presentation on this site.

Jesse

Strings

This topic simply shows several things that can be done with strings. It is not exhaustive and focuses on things that cannot be done as easily with Java strings.

Note: Because I am using Scala 2.7 we often need to use mkString to convert the processed string from a sequence of characters to a string. In Scala 2.8 this is not required.
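To illustrate the note, here is a small sketch that assumes Scala 2.8 or later, where collection methods called on a String hand back a String directly. The object name StringOpsSketch and the sample text are made up for illustration.

```scala
object StringOpsSketch {
  val greeting = "hello world"

  // On 2.8+ these already return String, so no mkString is needed
  val vowels: String  = greeting.filter("aeiou".contains(_))
  val shouted: String = greeting.map(_.toUpper)

  // split returns an Array[String], so mkString is still needed to join it
  val reversedWords: String = greeting.split(" ").map(_.reverse).mkString(" ")

  def main(args: Array[String]): Unit = {
    println(vowels)        // eoo
    println(shouted)       // HELLO WORLD
    println(reversedWords) // olleh dlrow
  }
}
```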

  /*
     Making use of raw strings to create a multi line string
     I add a | at the beginning of each line so that we can line up the quote nicely
     in source code then later strip it from the string using stripMargin
  */
  scala> val quote = """|I don-t consider myself a pessimist.
       |                |I think of a pessimist as someone who is waiting for it to rain.
       |                |And I feel soaked to the skin.
       |
       |                |Leonard Cohen"""
  quote: java.lang.String =
  |I don-t consider myself a pessimist.
                        |I think of a pessimist as someone who is waiting for it to rain.
                        |And I feel soaked to the skin.

                        |Leonard Cohen
  // capitalize the first character of each line
  scala> val capitalized = quote.lines.
       |                         map( _.trim.capitalize).mkString("\n")
  capitalized: String =
  |I don-t consider myself a pessimist.
  |I think of a pessimist as someone who is waiting for it to rain.
  |And I feel soaked to the skin.
  |Leonard Cohen
  // remove the margin of each line
  scala> quote.stripMargin
  res1: String =
  I don-t consider myself a pessimist.
  I think of a pessimist as someone who is waiting for it to rain.
  And I feel soaked to the skin.

  Leonard Cohen
  // this is silly.  I reverse the letters of each word but keep the words in order
  scala> quote.stripMargin.
       |       lines.
       |       map( _.split(" ").
       |              map(_.reverse).
       |              mkString (" ")).
       |      mkString("\n")
  res16: String =
  I t-nod redisnoc flesym a .tsimissep
  I kniht fo a tsimissep sa enoemos ohw si gnitiaw rof ti ot .niar
  dnA I leef dekaos ot eht .niks
  dranoeL nehoC
  scala> val myPatch = "-->This is my patch<--"
  myPatch: java.lang.String = -->This is my patch<--
  // Starting at index 10, I replace myPatch.length characters
  // of the string with myPatch (the full patch string)
  scala> quote.patch(10, myPatch, myPatch.length).mkString
  res21: String =
  |I don-t c-->This is my patch<--mist.
                        |I think of a pessimist as someone who is waiting for it to rain.
                        |And I feel soaked to the skin.

                        |Leonard Cohen
  // drop the first 3 lines of the string.
  // there is also a take method
  scala> quote.lines.drop(3).mkString("\n").stripMargin
  res25: String =

  Leonard Cohen
  // a bunch of examples of converting strings
  scala> "1024".toInt
  res26: Int = 1024
  scala> "1024".toFloat
  res27: Float = 1024.0
  scala> "1024".toDouble
  res28: Double = 1024.0
  scala> "1024".toLong
  res29: Long = 1024
  scala> "102".toByte
  res31: Byte = 102
  scala> "true".toBoolean
  res32: Boolean = true
  // format uses the java.util.Formatter class to format a string
  scala> "Hello %s,\nThe date is %2$tm %2$te,%2$tY".format("Jesse", new java.util.Date())
  res35: String =
  Hello Jesse,
  The date is 09 30,2009
  /*
     More silliness
     I am replacing every other character with the character of the reversed string

     this is done by
     1. convert string to a list and zip it together with its reverse
        We may still need to cover zipping.  It basically matches up the
        corresponding elements of two lists into one list
        so 1,2,3 and one,two,three zipped would be (1,one),(2,two),(3,three)
     2. Add an index to each element in the list with zipWithIndex
     3. Use map to check if the element is an odd element using the index and return either the original element or the reversed element

     Not useful but interesting use of functional idioms
  */
  scala> quote.toList.
       |       zip(quote.reverse.toList).
       |       zipWithIndex.
       |       map {
       |            case ((original,reversed),index) if(index % 2 == 0) => original
       |            case ((original,reversed),index) => reversed
       |           }.
       |       mkString
  res42: String = |e oo -r noes|d r m s l     e s m s .        . i s e t o   e|a shlne  f anp|s i i t a   o e n   h  .siwrioi gifrrfig ioirwis.  h   n e o   a t i i s|pna f  enlhs a|e   o t e s i .        . s m s e     l s m r d|seon r- oo e|
  // filter out all non-vowels
  scala> quote.filter( "aeiou" contains _ ).mkString
  res51: String = ooieeaeiiioaeiiaoeoeoiaiioioaieeoaeoeieoaoe
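The zip and zipWithIndex steps described in the comment above are easier to follow on a short input. The sketch below applies the same three steps to "abcdef"; the object name ZipSketch and the sample string are made up for illustration.

```scala
object ZipSketch {
  val text = "abcdef"

  // 1. pair every character with the character at the same position in the reversed string
  val pairs = text.toList.zip(text.reverse.toList)

  // 2. attach the position of each pair
  val indexed = pairs.zipWithIndex

  // 3. keep the original character at even positions, the reversed one at odd positions
  val mixed = indexed.map {
    case ((original, _), index) if index % 2 == 0 => original
    case ((_, reversed), _)                       => reversed
  }.mkString

  def main(args: Array[String]): Unit = {
    println(pairs.take(2)) // List((a,f), (b,e))
    println(mixed)         // aeccea
  }
}
```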

Multiple type bounds on a generic parameter

Suppose you want to declare a method that takes objects which implement two interfaces or parent types. For example, suppose the parameter must implement both Iterable and a function so that you can access the elements of the iterable via object(index). How can you do that in Scala?

This topic is derived from How do I setup multiple type bounds in Scala?

Answer:
  scala> def process[R <: Function1[Int,String] with Iterable[String]] (resource:R) = {
       | println("using function:"+resource(0))
       | println("using iterable:"+resource.elements.next)
       | }
  process: [R <: (Int) => String with Iterable[String]](R)Unit
  // Array is a Function1 and iterable so this works
  scala> process (Array("1","2","3"))
  using function:1
  using iterable:1
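The session above was run on Scala 2.7. As a hedged sketch for Scala 2.8 and later, where elements was replaced by iterator and where, as far as I know, a plain Array no longer satisfies this bound, the same compound bound can be exercised with a List, which is both a function from Int to its element type and an Iterable. The object name MultipleBounds is made up, and this variant returns the two values instead of printing them.

```scala
object MultipleBounds {
  // R must be both a function Int => String and an Iterable[String]
  def process[R <: (Int => String) with Iterable[String]](resource: R): (String, String) = {
    val viaFunction = resource(0)               // uses Function1.apply
    val viaIterable = resource.iterator.next()  // uses Iterable.iterator
    (viaFunction, viaIterable)
  }

  def main(args: Array[String]): Unit = {
    println(process(List("1", "2", "3"))) // (1,1)
  }
}
```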