open-source, performance

Load testing Apache Thrift

TL;DR: Use massive-attack

We are currently experimenting with integrating Apache Thrift into one of our Finatra-based APIs.

There was a bit of learning curve involved in this, as firstly most of our APIs use Akka HTTP, and also we had not utilised any RPC frameworks before. As part of the prototype, I created a simple Finatra API which had two endpoints that returned static responses: one over HTTP, and the other using Thrift. This is quite simple to do, after you figure out which plugins to use to generate the code based on the Thrift Interface Definition Language (IDL).

It took probably a day to set this up and deploy it to AWS – but then came the realisation that Thrift might simplify a lot of things, but load testing your endpoints is not one of them.

Because of how Thrift works, you would need to create a client to return your data; this is basically just a method. For example if you have created MyService Thrift service in API #1, you would simply create a client in API #2 which needs the data provided in MyService by:

lazy val thriftHost = "localhost:9911"

lazy val thriftClient: MyService.MethodPerEndpoint = 
  Thrift.client.build[MyService.MethodPerEndpoint](thriftHost)

API #2 can then surface the Thrift data from API #1 in JSON (or any other format) in an HTTP endpoint:

I then created Gatling load test scenarios for API #2 which load tested two endpoints: the first one powered by the API #1‘s HTTP endpoint, and the second one powered by API #1‘s Thrift endpoint.

The load tests ran fine, and Thrift-powered endpoint was faster as expected. But the problem with load testing this way is that you are basically load testing HTTP endpoints of API #2, not the Thrift endpoints of API #1, and it didn’t give me a clear idea of how many requests the Thrift endpoint can really handle when accessed from multiple APIs.

The next logical step was to look for load/performance testing tools that were capable of testing Thrift endpoints directly. This proved much more difficult than I expected; strictly speaking there are tools that can do this, but there were three big problems with them:

  1. They were quite complicated to use
  2. They were not updated in some cases in years
  3. And most importantly, they did not provide me with the information I wanted a load test tool to provide me with, such as how many RPS the Thrift endpoint can handle, and how fast.

In this process I experimented with Pinterest’s Bender, Twitter’s iago, and even tried writing my own JMeter Thrift plugin by following an obscure tweet down the rabbit hole.

Eventually all of these (failed) attempts made me think that load testing a Thrift endpoint cannot and definitely should not be this difficult. So, I started writing my own simple load testing tool, and called it simple-load-test. I eventually changed the name to massive-attack, which was brilliantly suggested to me by my colleague Michael; a bit of background: in my team we name our APIs after bands, which is incredibly confusing, but fun.

The concept behind massive-attack is quite simple: you can load test any method which returns a Scala (or Twitter) Future, and it will tell you the response times for that method after calling the method for the specified number of times or duration. You can do this as part of your normal unit/integration tests – I might change this later to implement the SBT’s test interface, but it works perfectly fine when added to Specs2 or ScalaTest scenarios.

For example to load test a Thrift endpoint,  you add the following to your test specs:

"Thrift endpoint" should {
  "provide average response times of less than 40ms" in {

    lazy val thriftHost = "localhost:9911"

    lazy val thriftClient: MyService.MethodPerEndpoint = 
      Thrift.client.build[MyService.MethodPerEndpoint](thriftHost)

    val testProperties = MethodPerformanceProps(
      invocations = 10000,
      duration = 300
    )

    val methodPerformance = new MethodPerformance(testProperties)

    val testResultF: Future[MethodPerformanceResult] = 
  methodPerformance.measure(() => thriftClient.programmes())

    val testResult = Await.result(testResultF, futureSupportTimeout)
  
    testResult.averageResponseTime must beLessThanOrEqualTo(40)
  }
}

This will call your thrift endpoint called “programmes” 10,000 times (or for 5 minutes, whichever come first) and asserts the average response times to be less than 40ms.

You can do assertions based on any of the properties returned as part of the test result. At the moment, following are supported:

  • Minimum response times (ms)
  • Maximum response times (ms)
  • 95 percentile response times (ms)
  • 99 percentile response times (ms)
  • Average response times (ms)
  • Number of invocations
  • Average requests (RPS)
  • Minimum requests (RPS)
  • Maximum requests (RPS)
  • Number of spikes
  • Percentage of spikes
  • Boundary that response is consider a spike

As you can tell, you can test any method this way – even HTTP endpoints:

...
  val httpClient: HttpClient = new HttpClient()
  val httpRequest: httpClient.RequestBuilder = 
    httpClient.get("http://0.0.0.0:8080/programmes")

   val testResultF: Future[MethodPerformanceResult] = 
     methodPerformance.measure(() => httpRequest.execute())
   ...      

As part of setting the test properties, you can also specify on how many threads you want to call your function – this is useful for HTTP/normal methods, but not so much for Thrift endpoints as the client runs only on one thread and calling it from multiple threads causes problems.

This library still needs a lot of work and fine-tuning, but the first version is now available through Maven – and more improvements will follow soon.

 

open-source, sbt-plugin

Tracing usage of your Scala library

Finding where your code is used across multiple projects in a big code base or organisation can be quite difficult, especially if you have made a change that needs to be propagated to every application that uses the updated client or library.

I have created an open source SBT plugin that can simplify this process a bit, more details can be found in the sbt-trace page.

vs.

Maps or For Comprehension?

Maps are a powerful tool in Scala, allowing you to apply a function to each item in a collection; however they sometimes can be a bit difficult to understand, especially for programmers who are new to Scala or functional programming in general.

As mentioned previously, one of the strength of Scala lies in the fact that it allows different solutions to the same problem. For example, given we have a list of Strings and want to transform each member to uppercase:

val listOfItems: List[String] = List("first", "second", "third")

We can simply do this using maps:

val upperCaseListOfItems: List[String] = listOfItems map (_.toUpperCase)

Which would result in:

upperCaseListOfItems: List[String] = List(FIRST, SECOND, THIRD)

However, we could have as easily used for comprehension:

val upperCaseListOfItems: List[String] = for {
  item <- listOfItems
} yield item.toUpperCase

Maps might be more functional, but I have always found for comprehensions easier to use and understand. It basically depends on how you learnt about Scala and your background, and which of these two methods you learnt about first!

I have always believed that what matters a lot is that the code is readable to other engineers, and for comps certainly achieve that.

vs.

Maps vs. Pattern Matching

One of the things I like about Scala is that it allows you to do the same thing, many different ways. Because of this, there is always the challenge of making the code more efficient; hopefully not at the expense of readability.

Take for example the concept of pattern matching in Scala. If we have:

case class Availability(startDate: Option[String])

val availability = Availability(Some("2016-05-06T09:00:00"))

And want to write a method which retrieves the startDate from an optional Availability, short of using if/else conditions, pattern matching would be the easiest way:

def getStartDate(available: Option[Availability]): Option[String] = available match {
  case Some(x) => x.startDate
  case None => None
}

Then if we call:

getStartDate(Some(availability))

We would get:

res0: Option[String] = Some(2016-05-06T09:00:00)

But this is not the most efficient way, and there is a much simpler way of getting the same result using maps:

def getStartDate(available: Option[Availability]): Option[String] = 
  available map (_.startDate) getOrElse None

Which can even be further simplified to:

def getStartDate(available: Option[Availability]): Option[String] = 
  available flatMap (_.startDate)

Personally I prefer the solution with flatMap, but because of the flexibility that Scala provides, you can always choose any of these options to get the same result.

 

json4s

json4s Custom Serializers

I recently had to work with custom serializers, which were interesting to say the least. I had two case classes and a trait in the following formats:

trait Parent
case class ChildClassOne(kind: String = "first_type", id: String) extends Parent
case class ChildClassTwo(kind: String = "second_type", id: String) extends Parent

And another case class which contained a list of Parents:

case class ParentResponse(total: Int, results: List[Parent])

Basically the json response might have a list of objects which can either be of type ChildClassOne or ChildClassTwo.

Because of this (I thought) I needed to create a custom serializer:

class ParentSerializer extends CustomSerializer[Parent](format => ( {
    case JObject(List(JField("kind", JString(kind)), JField("id", JString(id)))) 
        if kind == "first_type" => ChildClassOne(kind, id) 
    case JObject(List(JField("kind", JString(kind)), JField("id", JString(id)))) 
        if kind == "second_type" => ChildClassTwo(kind, id) 
  }, {
    case _ => null
  }))

This worked fine. Problem was that these objects might get quite big and I didn’t want to specify every single field in custom serializer. I also was not modifying the properties in any way, and was using the custom serializer just to return the right type of case class based on the kind field.

After trying different ways, I found doing it the following way to be the best solution using a Serializer:

trait Parent
case class ChildClassOne(kind: String = "first_type", id: String) extends Parent
case class ChildClassTwo(kind: String = "second_type", id: String) extends Parent

case class ParentResponse(total: Int, results: List[Parent])

class ParentSerializer extends Serializer[Parent] {
    private val ParentClass = classOf[Parent]
    implicit val formats = DefaultFormats

    def deserialize(implicit format: Formats): PartialFunction[(TypeInfo, JValue), Parent] = {
      case (TypeInfo(ParentClass, _), json) => json match {
        case JObject(JField("kind", JString(kind)) :: _) => kind match {
          case "first_type" => json.extract[ChildClassOne]
          case "second_type" => json.extract[ChildClassTwo]
        }

        case _ => throw new MappingException("Invalid kind")
      }
    }

    def serialize(implicit format: Formats): PartialFunction[Any, JValue] = Map()
  }

implicit val formats = DefaultFormats + new ParentSerializer
Akka

Akka

To create actors:
  1. Create ActorSystem
  2. Create an actor which extends untyped actor
  3. Create ActorRef which has System.actorOf step 2 actor
  4. Tell (fire & forget) or Ask ActorRef messages
  • Order of messages are retained per sender and receiver
  • Supervisor detects and responds to the failures of actors. This means if an actor crashes, a notification will be sent to its supervisor, which can decide what to do about it. This provides separation of processing and error handling
Uncategorized

Abstract Members

A member of a class or trait is abstract if the member does not have a complete definition in the class. Abstract members are intended to be implemented in subclasses of the class in which they are declared. Unlike Java, besides methods, you can also declare abstract fields and even abstract types as members of classes and traits.
  • Traits by definition are abstract
  • Types: one reason to use a type member is to define a short, descriptive alias for a type whose real name is more verbose, or less obvious in meaning, than the alias. Such type members can help clarify the code of a class or trait. The other main use of type members is to declare abstract types that must be defined in subclasses
  • It is OK to override a ‘def’ with a ‘val’ in the child class, however you cannot override a ‘val’ with a ‘def’
  • vars declared as members of classes come equipped with getter and setter methods. This holds for abstract vars as well. If you declare an abstract var named hour, for example, you implicitly declare an abstract getter method, hour, and an abstract setter method, hour_=
  • Pre-initialized fields let you initialize a field of a subclass before the superclass is called:
    • Pre-initialized fields in an anonymous class expression:
new {
     val numerArg = 1 * x
     val denomArg = 2 * x
} with RationalTrait
  • Pre-initialized fields in an object definition:
object twoThirds extends {
    val numerArg = 2
    val denomArg = 3
} with RationalTrait
  • Lazy vals: sometimes you might prefer to let the system itself sort out how things should be initialized. This can be achieved by making your val definitions lazy. If you prefix a val definition with a lazy modifier, the initialising expression on the right-hand side will only be evaluated the first time the val is used. This is similar to the situation where x is defined as a parameterless method, using a def. However, unlike a def a lazy val is never evaluated more than once. In fact, after the first evaluation of a lazy val the result of the evaluation is stored, to be reused when the same val is used subsequently
Uncategorized

Type Parameterization

Type parameterization allows you to write generic classes and traits. For example, sets are generic and take a type parameter: they are defined as Set[T]. As a result, any particular set instance might be a Set[String], a Set[Int], etc.—but it must be a set of something.
Functional queues
head returns the first element of the queue
tail returns a queue without its first element
enqueue returns a new queue with a given element appended at the end
scala> val q = Queue(1, 2, 3)
q: Queue[Int] = Queue(1, 2, 3)

scala> val q1 = q enqueue 4
q1: Queue[Int] = Queue(1, 2, 3, 4)

scala> q
res0: Queue[Int] = Queue(1, 2, 3)
Information hiding with private constructors
It is still possible to hide the primary constructor by adding aprivatemodifier in front of the class parameter list.
class Queue[T] private (
          private val leading: List[T],
          private val trailing: List[T]
)
The private modifier between the class name and its parameters indi- cates that the constructor of Queue is private: it can be accessed only from within the class itself and its companion object. The class name Queue is still public, so you can use it as a type, but you cannot call its constructor:
scala> new Queue(List(1, 2), List(3))
<console>:6: error: constructor Queue cannot be accessed in object $iw
               new Queue(List(1, 2), List(3))
               ˆ
Now that the primary constructor of class Queue can no longer be called from client code, there needs to be some other way to create new queues. One possibility is to add an auxiliary constructor, like this:
def this() = this(Nil, Nil)
The auxiliary constructor shown in the previous example builds an empty queue. As a refinement, the auxiliary constructor could take a list of initial queue elements:
def this(elems: T*) = this(elems.toList, Nil)
T* is the notation for repeated parameters.
Another possibility is to add a factory method that builds a queue from such a sequence of initial elements. A neat way to do this is to define an object Queue that has the same name as the class being defined and contains an apply method:
object Queue {
     // constructs a queue with initial elements ‘xs’
     def apply[T](xs: T*) = new Queue[T](xs.toList, Nil)
}
Variance Annotations
 
Prefixing a formal type parameter with a+indicates that subtyping is co- variant (flexible) in that parameter. By adding this single character, you are telling Scala that you wantQueue[String], for example, to be considered a subtype ofQueue[AnyRef]. The compiler will check thatQueueis defined in a way that this subtyping is sound.
Besides +, there is also a prefix -, which indicates contravariant sub-typing. IfQueuewere defined like this:
trait Queue[-T] { … }
then ifTis a subtype of typeS, this would imply thatQueue[S]is a sub- type ofQueue[T](which in the case of queues would be rather surprising!). Whether a type parameter is covariant, contravariant, or nonvariant is called the parameter’svariance. The+and-symbols you can place next to type parameters are calledvariance annotations.
Lower bounds
You can generalize enqueue by making it polymorphic (i.e., giving the enqueue method itself a type pa- rameter) and using a lower bound for its type parameter:
class Queue[+T] (private val leading: List[T], private val trailing: List[T] ) {
    def enqueue[U >: T](x: U) =
      new Queue[U](leading, x :: trailing) // ...
}
The new definition gives enqueue a type parameter U, and with the syntax, “U >: T”, defines T as the lower bound for U. As a result, U is required to be a supertype of T.1 The parameter to enqueue is now of type U instead of type T, and the return value of the method is now Queue[U] instead of Queue[T]. As an example, suppose there is a class Fruit with two subclasses, Apple and Orange. With the new definition of class Queue, it is possible to append an Orange to a Queue[Apple]. The result will be a Queue[Fruit].
Liskov Substitution Principle
It is safe to assume that a typeTis a subtype of a typeUif you can sub- stitute a value of type T wherever a value of type U is required.
Uncategorized

Collections

Sequences
Sequences types let you work with groups of data lined up in order. Because the elements are ordered, you can ask for the first element, second element, 103rd element, and so on.
Lists
 
Lists sup- port fast addition and removal of items to the beginning of the list, but they do not provide fast access to arbitrary indexes because the implementation must iterate through the list linearly.
The immutability of lists helps you develop correct, efficient algorithms because you never need to make copies of a list. Described separately in another note.
Arrays
Arrays allow you to hold a sequence of elements and efficiently access an element at an arbitrary position, both to get or update the element, with a zero-based index. Here’s how you create an array whose size you know, but for which you don’t yet know the element values:
scala> val fiveInts = new Array[Int](5)
fiveInts: Array[Int] = Array(0, 0, 0, 0, 0)
Here’s how you initialize an array when you do know the element values:
scala> val fiveToOne = Array(5, 4, 3, 2, 1)
fiveToOne: Array[Int] = Array(5, 4, 3, 2, 1)
List buffers
ClassListprovides fast access to the head of the list, but not the end. Thus, when you need to build a list by appending to the end, you should consider building the list backwards by prepending elements to the front, then when you’re done, callingreverseto get the elements in the order you need. Another alternative, which avoids thereverseoperation, is to use aListBuffer. AListBufferis a mutable object (contained in packagescala.collection.mutable), which can help you build lists more effi- ciently when you need to append.ListBufferprovides constant time ap- pend and prepend operations. You append elements with the+=operator, and prepend them with the+=:operator. When you’re done building, you can obtain aListby invokingtoListon theListBuffer.
scala> val buf = new ListBuffer[Int]
buf: scala.collection.mutable.ListBuffer[Int] = ListBuffer()

scala> buf += 1
res4: buf.type = ListBuffer(1)

scala> buf += 2
res5: buf.type = ListBuffer(1, 2)

scala> 3 +=: buf
res7: buf.type = ListBuffer(3, 1, 2)

scala> buf.toList
res8: List[Int] = List(3, 1, 2)

Array buffers

An ArrayBuffer is like an array, except that you can additionally add and remove elements from the beginning and end of the sequence. All Array operations are available, though they are a little slower due to a layer of wrapping in the implementation. The new addition and removal operations are constant time on average. When you create an ArrayBuffer, you must specify a type parameter, but need not specify a length. The ArrayBuffer will adjust the allocated space automatically as needed.
scala> val buf = new ArrayBuffer[Int]()
buf: scala.collection.mutable.ArrayBuffer[Int] =ArrayBuffer()

scala> buf += 12
res9: buf.type = ArrayBuffer(12)

scala> buf += 15
res10: buf.type = ArrayBuffer(12, 15)
Strings (via StringOps)
Because Predef has an implicit conversion from String to StringOps, you can treat any string like a sequence:
scala> def hasUpperCase(s: String) = s.exists(_.isUpper)
hasUpperCase: (s: String)Boolean

scala> hasUpperCase("Robert Frost")
res14: Boolean = true

scala> hasUpperCase("e e cummings")
res15: Boolean = false
Because no method named “exists” is declared in class String itself, the Scala compiler will implicitly convert s to StringOps, which has the method. The exists method treats the string as a sequence of characters.
Sets and Maps
Big difference between Sets and Lists is that Sets members are unique. By default when you write “Set” or “Map” you get an immutable object. If you want the mutable variant, you need to do an explicit import. If you want to use both mutable and immutable sets or maps in the same source file, one approach is to import the name of the package that contains the mutable variants:
scala> import scala.collection.mutable
import scala.collection.mutable
You can continue to refer to the immutable set as Set, as before, but can now refer to the mutable set as mutable.Set. Here’s an example:
scala> val mutaSet = mutable.Set(1, 2, 3)
mutaSet: scala.collection.mutable.Set[Int] = Set(3, 1, 2)
The key characteristic of sets is that they will ensure that at most one of each object, as determined by ==, will be contained in the set at any one time.  Because sets exclude duplicates, each distinct word will appear exactly one time in the set:
scala> val text = "See Spot run. Run, Spot. Run!"
text: java.lang.String = See Spot run. Run, Spot. Run!

scala> val wordsArray = text.split("[ !,.]+")
wordsArray: Array[java.lang.String]= Array(See, Spot, run, Run, Spot, Run)

scala>  val words = mutable.Set.empty[String]
words: scala.collection.mutable.Set[String] = Set()

scala> for (word <- wordsArray)
         words += word.toLowerCase
scala> words
res17: scala.collection.mutable.Set[String]= Set(spot, run, see)
Maps let you associate a value with each element of the collection. Using a map looks similar to using an array, except that instead of indexing with integers counting from 0, you can use any kind of key. If you import the scala.collection.mutable package, you can create an empty mutable map like this:
scala> val map = mutable.Map.empty[String, Int]
map: scala.collection.mutable.Map[String,Int] = Map()

scala> map("hello") = 1
scala> map("there") = 2
scala> map
  res20: scala.collection.mutable.Map[String,Int] =Map(hello -> 1, there -> 2)
Default sets and maps
For sets with fewer than five elements, a special class devoted exclusively to sets of each particular size is used, to maximize performance. Once you request a set that has five or more elements in it, however, the factory method will return an implementation that uses hash tries. As with sets, for immutable maps with fewer than five elements, a special class devoted exclusively to maps of each particular size is used, to maximize performance. Once a map has five or more key-value pairs in it, however, an immutable HashMap is used.
Sorted sets and maps
You may need a set or map whose iterator returns elements in a particular order. For this purpose, the Scala collections library provides traits SortedSet and SortedMap. These traits are implemented by classes TreeSet and TreeMap.
scala> val ts = TreeSet(9, 3, 1, 8, 0, 2, 7, 4, 6, 5)
ts: scala.collection.immutable.TreeSet[Int] = TreeSet(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> val cs = TreeSet('f', 'u', 'n')
cs: scala.collection.immutable.TreeSet[Char] = TreeSet(f, n, u)

//And here are a few TreeMap examples:

scala> var tm = TreeMap(3 -> 'x', 1 -> 'x', 4 -> 'x')
tm: scala.collection.immutable.TreeMap[Int,Char]= Map(1 -> x, 3 -> x, 4 -> x)

scala> tm += (2 -> 'x')
scala> tm
res30: scala.collection.immutable.TreeMap[Int,Char]= Map(1 -> x, 2 -> x, 3 -> x, 4 -> x)
Selecting mutable versus immutable collections
When in doubt, it is better to start with an immutable collection and change it later if you need to, because immutable collections can be easier to reason about than mutable ones.
Besides being potentially easier to reason about, immutable collections can usually be stored more compactly than mutable ones if the number of el- ements stored in the collection is small. For instance an empty mutable map in its default representation of HashMap takes up about 80 bytes and about 16 more are added for each entry that’s added to it. An empty immutable Map is a single object that’s shared between all references, so referring to it es- sentially costs just a single pointer field. What’s more, the Scala collections library currently stores immutable maps and sets with up to four entries in a single object, which typically takes up between 16 and 40 bytes, depending on the number of entries stored in the collection.4 So for small maps and sets, the immutable versions are much more compact than the mutable ones.
Initialise a collection with another collection
 
You need to create an empty TreeSet and pass elements of the set to it:
scala> val colors = Set(“blue”, “red”, “green")
scala> val orderedColors = TreeSet[String]() ++ colors
treeSet: scala.collection.immutable.TreeSet[String] = TreeSet(blue, green, red)
Tuples
A tuple combines a fixed number of items together so that they can be passed around as a whole. Unlike an array or list, a tuple can hold objects with different types. Here is an example of a tuple holding an integer, a string, and the console:
(1, "hello", Console)
A common application of tuples is returning multiple values from a method. For example:
def longestWord(words: Array[String]) = {
     var word = words(0)
     var index = 0

     for (i <- 1 to words.length) {
          if (words(i).length > word.length) {
               word = words(i)
               index = i
          }
     }

     (word, index)
}

scala> val (word, index) = longestWord(“The quick brown fox”.split(“ “))
scala> println(s”longest word is $word with index of $index”)
>> longest word is quick with index of 1 
Uncategorized

Lists

Lists are quite similar to arrays, but there are two important differences. First, lists are immutable. That is, elements of a list cannot be changed by assignment. Second, lists have a recursive structure (i.e., a linked list), whereas arrays are flat.
Example of Lists with explicit types added:
val fruits: List[String] = List("Apple", "Orange")
val empty: List[Nothing] = List()
Nothing is subtype of every other Scala type. So it’s possible to write this:
val empty: List[String] = List()
All operations on lists can be expressed in terms of the following three:
  • head returns the first element of a list
  • tail returns a list consisting of all elements except the first
  • isEmpty returns true if the list is empty
Pattern matching on lists
val List(apple, orange, banana) = fruits

//if you don’t know the length of the list:
val a :: b :: rest = fruits
List(...) is an instance of a library-defined extractor pattern.
First-order methods on class List
A method is first-order if it does not take any functions as arguments.
Concatenating two lists
List(1, 2) ::: List(3, 4, 5)
Accessing the end of a list: init and last
  • last returns last element of List
  • init returns all elements, except the last one
Unlike head and tail, which both run in constant time, init and last need to traverse the whole list to compute their result. They therefore take time proportional to the length of the list.
Prefixes and suffixes: drop, take, and splitAt
 
scala> abcde take 2
res8: List[Char] = List(a, b)

scala> abcde drop 2
res9: List[Char] = List(c, d, e)

scala> abcde splitAt 2
res10: (List[Char], List[Char]) = (List(a, b),List(c, d, e))
Element selection: apply and indices
scala> abcde apply 2 // rare in Scala, same as abcde(2)
res11: Char = c

scala> abcde.indices
res13: scala.collection.immutable.Range = Range(0, 1, 2, 3, 4)
Element selection is less popular for lists than for arrays because xs(n) takes time proportional to the index n.
Flattening a list of lists: flatten
The flatten method takes a list of lists and flattens it out to a single list:
scala> List(List(1, 2), List(3), List(), List(4, 5)).flatten
res14: List[Int] = List(1, 2, 3, 4, 5)
It can only be applied to lists whose elements are all lists. Trying to flatten any other list will give a compilation error.
Zipping lists: zip and unzip
The zip operation takes two lists and forms a list of pairs:
scala> abcde.indices zip abcde
res17: scala.collection.immutable.IndexedSeq[(Int, Char)] =
IndexedSeq((0,a), (1,b), (2,c), (3,d), (4,e))
If the two lists are of different length, any unmatched elements are dropped.
Any list of tuples can also be changed back to a tuple of lists by using the unzip method:
scala> zipped.unzip
res19: (List[Char], List[Int]) = (List(a, b, c),List(1, 2, 3))
Displaying lists: toString and mkString
The toString operation returns the canonical string representation of a list.
If you want a different representation you can use themkStringmethod. The operationxs mkString (pre, sep, post)involves four operands: the listxsto be displayed, a prefix stringpreto be displayed in front of all elements, a separator stringsepto be displayed between successive elements, and a postfix stringpostto be displayed at the end. The result of the operation is the string
Converting lists: iterator, toArray, copyToArray
To convert data between the flat world of arrays and the recursive world of lists, you can use method toArray in class List and toList in class Array.
 
Higher-order methods on class List
 
Mapping over lists: map, flatMap and foreach
The operation xs map f takes as operands a list xs of type List[T] and a function f of type T => U. It returns the list resulting from applying the function f to each list element in xs. For instance:
scala> List(1, 2, 3) map (_ + 1)
res32: List[Int] = List(2, 3, 4)

scala> val words = List("the", "quick", "brown", "fox")
words: List[java.lang.String] = List(the, quick, brown, fox)

scala> words map (_.length)
res33: List[Int] = List(3, 5, 5, 3)

scala> words map (_.toList.reverse.mkString)
res34: List[String] = List(eht, kciuq, nworb, xof)
The flatMap operator is similar to map, but it takes a function returning a list of elements as its right operand. Wheremapreturns a list of lists,flatMapreturns a single list in which all element lists are concatenated.
Unlike map and flatMap, foreach takes a procedure (a function with result type Unit) as right operand. It simply applies the procedure to each list element. The result of the operation itself is again Unit; no list of results is assembled.
scala> var sum = 0
sum: Int = 0

scala> List(1, 2, 3, 4, 5) foreach (sum += _)

scala> sum
res39: Int = 15
Filtering lists: filter, partition, find, takeWhile, dropWhile, and span
The operation “xs filter p” takes as operands a list xs of type List[T] and a predicate function p of type T => Boolean. It yields the list of all elements x in xs for which p(x) is true. For instance:
scala> List(1, 2, 3, 4, 5) filter (_ % 2 == 0) res40: List[Int] = List(2, 4)
scala> words filter (_.length == 3)
res41: List[java.lang.String] = List(the, fox)
The partition method is like filter, but it returns a pair of lists. One list contains all elements for which the predicate is true, while the other list contains all elements for which the predicate is false.
The find method is also similar to filter but it returns the first element satisfying a given predicate, rather than all such elements. It returns an optional value (Some/None).
The takeWhile and dropWhile operators also take a predicate as their right operand. The operation xs takeWhile p takes the longest prefix of list xs such that every element in the prefix satisfies p. Analogously, the operation xs dropWhile p removes the longest prefix from list xs such that every element in the prefix satisfies p. Here are some examples:
scala> List(1, 2, 3, -4, 5) takeWhile (_ > 0)
res45: List[Int] = List(1, 2, 3)

scala> words dropWhile (_ startsWith "t")
res46: List[java.lang.String] = List(quick, brown, fox)
The span method combines takeWhile and dropWhile in one operation, just like splitAt combines take and drop. It returns a pair of two lists, defined by the equality.
scala> List(1, 2, 3, -4, 5) span (_ > 0)
res47: (List[Int], List[Int]) = (List(1, 2, 3),List(-4, 5))
Predicates over lists: forall and exists
The operation xs forall p takes as arguments a list xs and a predicate p. Its result is true if all elements in the list satisfy p. Conversely, the operation xs exists p returns true if there is an element in xs that satisfies the predicate p.
val variedNumList: List[Int] = List(1, 2, 3, -4, -5)
variedNumList forall(_ > 3) //returns false
variedNumList exists(_ < 3) //returns true
Folding lists: /: and :\
Another common kind of operation combines the elements of a list with some operator.
scala> def sum(xs: List[Int]): Int = (0 /: xs) (_ + _)
sum: (xs: List[Int])Int
A fold left operation “(z /: xs) (op)” involves three objects: a start value z, a list xs, and a binary operation op. The result of the fold is op applied between successive elements of the list prefixed by z.
To concatenate all words in a list of strings with spaces between them and in front, you can write this:
scala>  ("" /: words) (_ +" "+ _)
res49: java.lang.String =  the quick brown fox
Sorting lists: sortWith
The operation xs sortWith before, where “xs” is a list and “before” is a function that can be used to compare two elements, sorts the elements of list xs. The expression x before y should return true if x should come before y in the intended ordering for the sort. For instance:
scala> List(1, -3, 4, 2, 6) sortWith (_ < _)
res51: List[Int] = List(-3, 1, 2, 4, 6)

scala> words sortWith (_.length > _.length)
res52: List[java.lang.String] = List(quick, brown, the, fox)
Methods of the List object
The following are methods available on the List companion object.
Creating a range of numbers: List.range
This creates a list that includes a range of numbers. It’s possible to supply a third parameter, as a “step”. For example:
scala> List.range(1, 5)
res54: List[Int] = List(1, 2, 3, 4)

scala> List.range(1, 9, 2)
res55: List[Int] = List(1, 3, 5, 7)

scala> List.range(9, 1, -3)
res56: List[Int] = List(9, 6, 3)
Creating uniform lists: List.fill
The fill method creates a list consisting of zero or more copies of the same element. It takes two parameters: the length of the list to be created, and the element to be repeated. Iffillis given more than two arguments, then it will make multi- dimensional lists.
scala> List.fill(5)('a')
res57: List[Char] = List(a, a, a, a, a)

scala> List.fill(2, 3)('b')
res59: List[List[Char]] = List(List(b, b, b), List(b, b, b))
Tabulating a function: List.tabulate
 
The tabulate method creates a list whose elements are computed according to a supplied function. Its arguments are just like those of List.fill:
scala> val squares = List.tabulate(5)(n => n * n)
squares: List[Int] = List(0, 1, 4, 9, 16)
Concatenating multiple lists: List.concat
The concat method concatenates a number of element lists. The lists to be  concatenated are supplied as direct arguments to concat:
scala> List.concat(List('a', 'b'), List('c'))
res60: List[Char] = List(a, b, c)
Processing multiple lists together
Thezippedmethod on tuples generalizes several common operations to work on multiple lists instead of just one. One such operation ismap. Themapmethod for two zipped lists maps pairs of elements rather than individ- ual elements.
scala> (List(10, 20), List(3, 4, 5)).zipped.map(_ * _)
res63: List[Int] = List(30, 80)
There are also zipped analogs toexistsandforall. They are just like the single-list versions of those methods except they operate on elements from multiple lists instead of just one:
scala> (List("abc", "de"), List(3, 2)).zipped.forall(_.length == _)
res64: Boolean = true

scala> (List("abc", "de"), List(3, 2)).zipped.exists(_.length != _)
res65: Boolean = false