Daily Scala: Calling Java APIs

Monday, August 10, 2009

Calling Java APIs

Calling Java APIs from Scala is completely seamless. I will demonstrate this functionality by copying data from a URL to a file and then making a copy of that file.

Important Note: Scala 2.8 is getting a redesigned API for accessing files and streams, based on JSR-203 (New NIO). It is quite nice to use. For example, File("/tmp") / "dir" / "dir2" will give you a File for /tmp/dir/dir2. That is just the most basic of what you can expect. I will do a couple of topics on that when it gets closer to being finalized.
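Since Java APIs are callable from Scala, the JSR-203 classes themselves (java.nio.file, shipped with Java 7) can be used for a similar path-building step. The following is only a sketch against the Java API, not the Scala 2.8 API the note describes; PathSketch and nested are names made up for illustration.

```scala
import java.nio.file.{Path, Paths}

object PathSketch {
  // Hypothetical helper: start at `root` and append each part in turn.
  // Path.resolve only builds a new path value; nothing is created on disk.
  def nested(root: String, parts: String*): Path =
    parts.foldLeft(Paths.get(root))((path, part) => path.resolve(part))
}
```

Calling PathSketch.nested("/tmp", "dir", "dir2") gives the same Path as Paths.get("/tmp", "dir", "dir2").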

scala> import java.net._
import java.net._
scala> import scala.io._
import scala.io._
scala> import java.io.{File, FileWriter}
import java.io.{File, FileWriter}
scala> val in = Source.fromURL("http://www.google.com")
in: scala.io.Source = non-empty iterator
scala> // Until Scala 2.8 we have to use the standard Java streams and idioms
scala> val out = new FileWriter("/tmp/daily-scala")
out: java.io.FileWriter = java.io.FileWriter@71d0e17a
scala> try {
    | out.write( in.getLines.mkString("\n") )
    | }finally{
    | out.close
    | }
scala> // now let's copy the file
scala> val copy = new FileWriter("/tmp/copy")
copy: java.io.FileWriter = java.io.FileWriter@7bfd25ce
scala> try {
    | copy.write( Source.fromFile("/tmp/daily-scala").getLines.mkString("\n") )
    | } finally {copy.close}
scala> val copy2 = new FileWriter("/tmp/copy2")
copy2: java.io.FileWriter = java.io.FileWriter@7bfd25ce
// You can reuse a source if you reset it
scala> try {
    | copy2.write( in.reset.getLines.mkString("\n") )
    | } finally {copy2.close}
// Change all 'e' to upper case.  We could write this to a file if we desired
scala> in.reset.getLines.mkString("\n").map( c => if (c == 'e') c.toUpperCase else c).mkString("")
res9: String = This is thE dEmo filE
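Outside the REPL, the same read-then-write steps might be collected into small helpers like the ones below. This is a sketch assuming a recent Scala version (where getLines() takes parentheses); CopySketch, readAll, writeAll and copyText are illustrative names, not part of any library.

```scala
import java.io.{File, FileWriter}
import scala.io.Source

object CopySketch {
  // Read a whole text file; the Source is closed even if reading fails.
  def readAll(file: File): String = {
    val src = Source.fromFile(file)
    try src.getLines().mkString("\n") finally src.close()
  }

  // Write text to a file; the writer is closed even if writing fails.
  def writeAll(file: File, text: String): Unit = {
    val out = new FileWriter(file)
    try out.write(text) finally out.close()
  }

  // Copy a text file by reading its lines and writing them back joined with "\n".
  def copyText(from: File, to: File): Unit = writeAll(to, readAll(from))
}
```

Note that getLines strips line terminators and mkString("\n") does not put a final one back, so this is a text-level copy rather than a byte-for-byte one.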

4 comments:

Unknown, August 13, 2009 at 6:40 AM
Hi Jesse

Thanks a lot for this valuable sample!

I pushed the sample further and tried to go for binary copies in scala.

So after a while I got the following snippet to do the job in scala:

[code]

import java.io._

val src = "./test.txt"
val dst = "/home/simon/workspace/scala/test_copy.txt"

val in = new FileInputStream(src)
val out = new FileOutputStream(dst)

// Transfer bytes from in to out
val buf = new Array[Byte](1024)
var len = 0

len = in.read(buf)
while (len > 0) {
  out.write(buf, 0, len)
  len = in.read(buf)
}

in.close
out.close

[/code]

Now I feel I could do better: I don't like this additional line:
len = in.read(buf)
at the beginning of the loop.

In Java this would be less code and more elegant, because I could leave out that line
and write the while statement like this:

while ((len = in.read(buf)) > 0) {

This compiles and even runs under Scala but, for known reasons, I get a zero-length copy
plus the warning
"comparing values of types Ordered[Unit] and Unit using `>' will always yield false"

So how could I do better?

Thanks a lot
Patrick
Unknown, August 13, 2009 at 7:48 AM
I have a couple of points for you. First, an assignment (=) in Scala returns Unit, not the assigned value. So you cannot do:

val x = y = 1

This is why while ((len = in.read(buf)) > 0) does not work the way it does in Java.

I honestly can't remember the reasoning behind this.

Right now I would not obsess about the best way to copy in Scala, because the API is changing dramatically in Scala 2.8. For example:

Source.fromFile("xyz") pumpto Sink.fromFile("out")

I don't know the API yet but that is the sort of thing you can expect.

You get a zero-length copy because (len = in.read(buf)) returns Unit, and comparing Unit with 0 is always false, so the loop body never runs.
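One way to avoid both the extra read before the loop and the assignment inside the condition is Iterator.continually, available in Scala 2.8 and later. The sketch below is just one possible shape; StreamCopySketch and copy are illustrative names, and the byte count it returns is an addition for easy checking.

```scala
import java.io.{InputStream, OutputStream}

object StreamCopySketch {
  // Copy every byte from `in` to `out` and return how many bytes were copied.
  def copy(in: InputStream, out: OutputStream): Long = {
    val buf = new Array[Byte](1024)
    var total = 0L
    // Each element is the result of one read; read returns -1 at end of
    // stream, so takeWhile stops the loop there. The iterator is lazy, so
    // each chunk is written before the next read overwrites the buffer.
    Iterator
      .continually(in.read(buf))
      .takeWhile(_ > 0)
      .foreach { len =>
        out.write(buf, 0, len)
        total += len
      }
    total
  }
}
```

The caller is still responsible for closing both streams, for example in a finally block as in the post.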

A more "functional" way to do this would be to use recursion. It is totally unnecessary, but if you want to learn to program in a functional manner then you would probably do:
import java.io._

val in = new FileInputStream("/tmp/in")
val out = new FileOutputStream("/tmp/out")

def copy(in: FileInputStream, out: FileOutputStream): Unit = {
  val buf = new Array[Byte](1024)
  val len = in.read(buf)
  if (len > 0) {
    out.write(buf, 0, len)
    copy(in, out)
  }
}

copy(in,out)
in.close()
out.close()

I have a bug in this little program that I don't have time to figure out. The recursion should be turned into a loop thanks to tail-call optimization, so you don't get a stack overflow, but apparently it is not. I think I may have to ask the Scala list.
Unknown, August 14, 2009 at 12:11 AM
Ah I figured out the bug (with some help). The copy method could be overridden by a subclass so it can't be optimized. It has to be final or it has to be an inner function:

def copyFile(..)={
def copy(...){...}
}

or final:

final def copy(..){...}

I made that change, compiled it, and then used jd-gui to look at the bytecode, and it is transformed into a for loop. I would still like to do a speed test to see whether the repeated allocation of the array is expensive. I have been told that the JVM should reuse the same object, so the difference should be minimal.
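The inner-function variant could be filled in along the following lines. This is a sketch, assuming Scala 2.8 or later for the @tailrec annotation, which makes the compiler reject the method if the recursive call cannot be turned into a loop. TailrecCopySketch, copyStream and loop are illustrative names, and allocating the buffer once outside the recursion is a change from the version above that sidesteps the allocation question.

```scala
import java.io.{InputStream, OutputStream}
import scala.annotation.tailrec

object TailrecCopySketch {
  def copyStream(in: InputStream, out: OutputStream): Unit = {
    // Allocated once and reused by every step of the recursion.
    val buf = new Array[Byte](1024)

    // A local function cannot be overridden, so the compiler can
    // optimize the self-call; @tailrec makes it verify that it did.
    @tailrec
    def loop(): Unit = {
      val len = in.read(buf)
      if (len > 0) {
        out.write(buf, 0, len)
        loop()
      }
    }

    loop()
  }
}
```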
Henry Buckley, October 19, 2012 at 8:14 AM
I think the final 'mkString("")' is redundant in this line:

scala> in.reset.getLines.mkString("\n").map( c => if (c == 'e') c.toUpperCase else c).mkString("")
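A small check of that observation, assuming Scala 2.8 or later, where map on a String with a Char => Char function already returns a String (the result type on the 2.7 REPL used in the post may differ). MapOnString and its members are illustrative names, and toUpper is used here in place of toUpperCase.

```scala
object MapOnString {
  val demo = "This is the demo file"

  // map with a Char => Char function yields a String directly.
  val viaMap: String = demo.map(c => if (c == 'e') c.toUpper else c)

  // Joining the characters of that String again changes nothing.
  val viaMkString: String = viaMap.mkString("")
}
```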