Scala tasks and questions - solutions and explanations

#1. Why the method generates an error at compile time

The error occurs because the compiler treats a method body made only of case clauses as a pattern-matching anonymous function and expands it into


SOMETHING match {
  case xs :Seq[Int] if xs.isEmpty => 0
  case xs :Seq[Int] if xs.nonEmpty => xs.head + recursiveSum(xs.tail: _*)
}


and the Scala compiler does not know what SOMETHING is or what its type is. You might assume it is the xs: Int* parameter, but nothing says so. The method could just as well be defined with several parameters, like this


def recursiveSum(a :Int,s :String, xs :Int*) :Int = {


We can solve this by explicitly matching on the parameter, like this (inside the case clauses we use the name xs, but it can be any name: it is a new pattern variable that shadows the parameter xs, not a reference to it).


def recursiveSum(xs :Int*) :Int = {
  xs match {
    case xs if xs.isEmpty => 0
    case xs if xs.nonEmpty => xs.head + recursiveSum(xs.tail: _*)
  }
}

recursiveSum((1 to 5): _ *)
result : 15
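
As a variation on the fix above, the guards can be replaced by structural patterns. This is only a sketch: the object name RecursiveSumSketch is made up for illustration, and it assumes the standard +: extractor for sequences.

```scala
object RecursiveSumSketch {
  // Match on the shape of the sequence instead of using isEmpty/nonEmpty guards.
  def recursiveSum(xs: Int*): Int = xs match {
    case Seq()        => 0                             // empty sequence
    case head +: tail => head + recursiveSum(tail: _*) // first element plus the rest
  }
}
```

Calling RecursiveSumSketch.recursiveSum((1 to 5): _*) should give the same sum as the guarded version.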


Also, we can do it without pattern matching, using if/else:


val g: (Seq[Int]) => Int = si => {
  if (si.isEmpty) 0
  else si.head + g(si.tail)
}
g(Seq(1,2,3))
//res0: Int = 6

def calcRecursIf(i :Int*) :Int = {
  if (i.isEmpty) 0
  else
    i.head + calcRecursIf(i.tail : _*)
}
calcRecursIf(List(1,2,3,4,5) :_ *)
//res1: Int = 15


Or this way:

val f: Seq[Int] => Int = {
  case s if s.isEmpty => 0
  case s if s.nonEmpty => s.head + f(s.tail)
}
f(Seq(1,2,3,4,5))

This works because the expected type here is a function type, so the compiler expands the case block into something like this:

val f: Seq[Int] => Int = {
  (_ :Seq[Int]) match {
    case s if s.isEmpty => 0
    case s if s.nonEmpty => s.head + f(s.tail)
  }
}

f(Seq(1,2,3,4,5))

res0: Int = 15
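
The same expansion is what makes a bare case block work as an argument to methods such as map, where the expected type is again a function. A small sketch (the object and value names are hypothetical):

```scala
object CaseBlockSketch {
  val pairs: List[(String, Int)] = List("a" -> 1, "b" -> 2)

  // map expects a function, so the bare case block is accepted.
  val viaCases: List[String] = pairs.map { case (k, v) => s"$k=$v" }

  // The same thing written out as an explicit lambda with a match.
  val viaMatch: List[String] = pairs.map(p => p match { case (k, v) => s"$k=$v" })
}
```

Both values should hold the same list, since the first form is shorthand for the second.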




#2. How to make a chain of PartialFunctions, manually and with fold, and use it to filter data.
     Plus a short explanation of P.F. and their methods.

A PartialFunction is a function that is defined only for some values of its input type.

A partial function of type `PartialFunction[A, B]` is a unary function
where the domain does not necessarily include all values of type `A`.

For example, you can write a P.F. that "works" only on Int values in the interval 0-10, or only on values > 10, or only on values != 0.

P.F. is a trait declared in the file PartialFunction.scala, package scala:
trait PartialFunction[-A, +B] extends (A => B)

The trait declares the method def isDefinedAt(x: A): Boolean, which checks whether an input value belongs to the domain.
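
To make the idea of a domain concrete, here is a minimal sketch of two of the Int examples mentioned above. The names DomainSketch, inRange and nonZero are invented for illustration.

```scala
object DomainSketch {
  // Defined only for Int values in the interval 0-10.
  val inRange: PartialFunction[Int, String] = {
    case n if n >= 0 && n <= 10 => s"$n is within 0-10"
  }

  // Defined only for non-zero values, so the division is always safe.
  val nonZero: PartialFunction[Int, Int] = {
    case n if n != 0 => 100 / n
  }
}
```

Here isDefinedAt answers the domain question without running the body, so a caller can check DomainSketch.nonZero.isDefinedAt(0) before applying the function.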

The next examples use the following data structures.



case class Car(name :String, year :Int){
  override def toString :String = name+" "+year
}
case class Country(name :String)
case class Person(name: String, car :Car, country :Country){
  override def toString :String =
    s"$name - $country -  $car"
}

val listpers :Seq[Person] = Seq(
  Person("John",Car("mers",2013),Country("russia")),
  Person("Mark",Car("mers",2013),Country("usa")),
  Person("Jenya",Car("mers",2013),Country("russia")),
  Person("James",Car("mers",2013),Country("usa")),
  Person("Alex",Car("mers",2012),Country("russia")),
  Person("Smith",Car("mers",2013),Country("usa")),
  Person("Jasper",Car("mers",2013),Country("russia")),
  Person("Jastin",Car("mers",2013),Country("usa"))
)


This structure describes people who own a Car and live in a Country. It is simple: a Country has just a name, and a Car has a name and a year of production.

Next we create a type alias for PartialFunction[Person, Person]. Each P.F. we write has type
A => A, that is Person => Person. In general a P.F. is A => B, but we want to use it only for filtering, without changing the type.


type pfPP = PartialFunction[Person,Person]

val pfCarFilter :pfPP ={
  case p if p.car.year==2013 => p
}

val pfCountryFilter :pfPP ={
  case p if p.country.name=="russia" => p
}

val pfPersonNameFilter :pfPP = {
  case p if p.name.startsWith("J") => p
}


Look at this carefully: we declare 3 P.F.s, and each of them checks one property of a Person: the year of the car, the name of the country, and the first letter of the name.
We declare each of them as a function value. Another possible form defines the methods apply and isDefinedAt explicitly:


val pfCheckName = new pfPP {
  def apply(p: Person) = p
  def isDefinedAt(p: Person) = p.name.startsWith("J")
}


Or you can do it as a method:


def pfCheckName :pfPP = new pfPP{
  def apply(p: Person) = p
  def isDefinedAt(p: Person) = p.name.startsWith("J")
}
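
One difference between the two forms is worth a sketch. In the explicit form above, apply has no guard, so calling it directly on a value outside the domain returns the value, while the case-literal form throws a MatchError. The example below uses String instead of Person to stay self-contained, and all names in it are hypothetical.

```scala
object ExplicitVsLiteralSketch {
  // Case-literal form: the compiler derives both apply and isDefinedAt from the guard.
  val literalForm: PartialFunction[String, String] = {
    case s if s.startsWith("J") => s
  }

  // Explicit form: apply and isDefinedAt are written by hand and are independent.
  val explicitForm: PartialFunction[String, String] = new PartialFunction[String, String] {
    def apply(s: String): String = s                         // no domain check here
    def isDefinedAt(s: String): Boolean = s.startsWith("J")  // the domain lives only here
  }
}
```

With collect both forms behave the same, because collect consults the domain before applying the function.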


Next, I show how to chain several partial functions into a single partial function and use it for filtering.

P.F.s can be "connected" in different ways:

1) compose (applied from right to left)
/**
 * Composes another partial function `k` with this partial function so that this
 * partial function gets applied to results of `k`.
*/

2) andThen (applied from left to right)
/**
 * Composes this partial function with another partial function that
 * gets applied to results of this partial function.
*/

3) orElse
/** Composes this partial function with a fallback partial function which
 *  gets applied where this partial function is not defined.
*/

Roughly speaking, andThen acts as a logical AND and orElse as a logical OR, and they can be used to build combined filters.

For example, we can combine our P.F.s like this:


val commonFilter = pfPersonNameFilter andThen pfCountryFilter andThen pfCarFilter


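The P.F. commonFilter has the following domain:
 ONLY Persons whose name starts with "J" AND who live in Russia (pfCountryFilter) AND whose car was made in 2013. This relies on the andThen overload that takes a PartialFunction (available since Scala 2.13), which checks the domains of both sides. The older andThen that takes a plain function checks only the domain of the first P.F.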


val commonFilter = pfPersonNameFilter andThen pfCountryFilter andThen pfCarFilter
listpers collect commonFilter map println

commonFilter: PartialFunction[Person,Person] = <function1>
John - Country(russia) -  mers 2013
Jenya - Country(russia) -  mers 2013
Jasper - Country(russia) -  mers 2013
res1: Seq[Unit] = List((), (), ())
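
The example above covers the AND case with andThen. For the OR case, a minimal sketch with orElse might look like this (Int is used instead of Person to keep it self-contained, and the names are hypothetical):

```scala
object OrElseSketch {
  val small: PartialFunction[Int, String] = { case n if n >= 0 && n <= 10 => "small" }
  val big: PartialFunction[Int, String]   = { case n if n > 100 => "big" }

  // Defined wherever at least one of the two is defined; small is tried first.
  val smallOrBig: PartialFunction[Int, String] = small orElse big
}
```

Collecting List(5, 50, 500, -1) with smallOrBig keeps only 5 and 500, because 50 and -1 are outside both domains.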



P.S. collect and map are described at the end of this section.

Above we combined the P.F.s manually with andThen.
But what if we have a sequence of partial functions (of size 10, 50, ...) that is populated dynamically from different sources? We can combine them with foldLeft (just as one example).
The initial value of the accumulator is the head (the first P.F.), and we fold over the tail.


val seqPfF :Seq[pfPP] = Seq(pfPersonNameFilter, pfCountryFilter, pfCarFilter)
val combFilters = seqPfF.tail.foldLeft(seqPfF.head)(_ andThen _)

listpers collect combFilters map println

seqPfF: Seq[pfPP] = List(<function1>, <function1>, <function1>)
combFilters: pfPP = <function1>

John - Country(russia) -  mers 2013
Jenya - Country(russia) -  mers 2013
Jasper - Country(russia) -  mers 2013
res0: Seq[Unit] = List((), (), ())
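
Note that folding from head and tail fails on an empty sequence, because head throws. One possible alternative, sketched here under the assumption of Scala 2.13 or later (where andThen on two partial functions checks both domains), is to start the fold from an identity P.F. that is defined everywhere. The names IntFilter, passAll and combine are made up for this example.

```scala
object FoldSketch {
  type IntFilter = PartialFunction[Int, Int]

  val positive: IntFilter = { case n if n > 0 => n }
  val even: IntFilter     = { case n if n % 2 == 0 => n }
  val small: IntFilter    = { case n if n < 10 => n }

  // Identity partial function: defined for every input, so it is neutral for andThen.
  val passAll: IntFilter = { case n => n }

  // Works for any number of filters, including zero.
  def combine(filters: Seq[IntFilter]): IntFilter =
    filters.foldLeft(passAll)(_ andThen _)
}
```

With an empty sequence of filters, combine returns a P.F. that lets every element through.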



PartialFunction also has the following methods:
1) lift
/**
 * Turns this partial function into a plain function returning an `Option` result.
*/
It returns None when isDefinedAt is false.


listpers map pfCountryFilter.lift map println

Some(John - Country(russia) -  mers 2013)
None
Some(Jenya - Country(russia) -  mers 2013)
None
Some(Alex - Country(russia) -  mers 2012)
None
Some(Jasper - Country(russia) -  mers 2013)
None
res0: Seq[Unit] = List((), (), (), (), (), (), (), ())
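
A lifted function combines naturally with flatMap and getOrElse. The following sketch (hypothetical names, Int instead of Person) shows that flattening the lifted results gives the same elements as collect:

```scala
object LiftSketch {
  val half: PartialFunction[Int, Int] = { case n if n % 2 == 0 => n / 2 }

  // A total function: Some(result) inside the domain, None outside it.
  val halfOpt: Int => Option[Int] = half.lift

  val viaLift: List[Int]    = List(1, 2, 3, 4).flatMap(n => halfOpt(n))
  val viaCollect: List[Int] = List(1, 2, 3, 4).collect(half)

  // A default value for inputs outside the domain.
  val orDefault: Int = halfOpt(3).getOrElse(-1)
}
```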



2) runWith
/** Composes this partial function with an action function which
 *  gets applied to results of this partial function.
 *  The action function is invoked only for its side effects; its result is ignored.
*/


listpers.toList.map(p => pfCountryFilter.runWith(println)(p))

John - Country(russia) -  mers 2013
Jenya - Country(russia) -  mers 2013
Alex - Country(russia) -  mers 2012
Jasper - Country(russia) -  mers 2013
res0: List[Boolean] = List(true, false, true, false, true, false, true, false)
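
The action passed to runWith does not have to be println. As a sketch, the side effect below appends to a mutable buffer, and the returned Boolean tells whether the input was in the domain. The names are hypothetical.

```scala
import scala.collection.mutable.ListBuffer

object RunWithSketch {
  val evenOnly: PartialFunction[Int, Int] = { case n if n % 2 == 0 => n }

  val seen: ListBuffer[Int] = ListBuffer.empty[Int]

  // runWith returns a function Int => Boolean; the action runs only inside the domain.
  val record: Int => Boolean = evenOnly.runWith(n => seen += n)

  val flags: List[Boolean] = List(1, 2, 3, 4).map(record)
}
```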



3) condOpt (a method of the PartialFunction companion object)
/** Transforms a PartialFunction[T, U] `pf` into Function1[T, Option[U]] `f`
 *  whose result is `Some(x)` if the argument is in `pf`'s domain and `None`
 *  otherwise
 */


import PartialFunction._
listpers.map(v => condOpt(v)(pfCountryFilter))

res0: Seq[Option[Person]] = List(
Some(John - Country(russia) -  mers 2013), 
None, 
Some(Jenya - Country(russia) -  mers 2013), 
None, 
Some(Alex - Country(russia) -  mers 2012), 
None, 
Some(Jasper - Country(russia) -  mers 2013), 
None)
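
condOpt is often used with an inline case block rather than a named P.F. The companion object also has cond, which returns a Boolean and yields false outside the domain. A small sketch with made-up names:

```scala
import scala.PartialFunction.{cond, condOpt}

object CondSketch {
  // Some(label) when one of the cases matches, None otherwise (here: for 0).
  def sign(n: Int): Option[String] = condOpt(n) {
    case x if x > 0 => "positive"
    case x if x < 0 => "negative"
  }

  // true only when a case matches and its body is true; false outside the domain.
  def isPositive(n: Int): Boolean = cond(n) { case x if x > 0 => true }
}
```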



4) applyOrElse
/** Applies this partial function to the given argument when it is contained in the function domain.
 *  Applies fallback function where this partial function is not defined.
 */
Here pfCountryFilter is applied to elements in the domain, and println (which prints to the console and returns Unit) is applied to elements outside the domain.


listpers.map(p => pfCountryFilter.applyOrElse(p,println))

Mark - Country(usa) -  mers 2013
James - Country(usa) -  mers 2013
Smith - Country(usa) -  mers 2013
Jastin - Country(usa) -  mers 2013
res0: Seq[Any] = List(
John - Country(russia) -  mers 2013, 
(), 
Jenya - Country(russia) -  mers 2013, 
(), 
Alex - Country(russia) -  mers 2012, 
(), 
Jasper - Country(russia) -  mers 2013, 
()
)
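
In the example above the result type widens to Any, because the fallback returns Unit. If the fallback returns the same type as the P.F., the result stays precisely typed. A minimal sketch with hypothetical names:

```scala
object ApplyOrElseSketch {
  val reciprocal: PartialFunction[Int, Double] = { case n if n != 0 => 1.0 / n }

  // The fallback also returns a Double, so the result type is Double, not Any.
  def safeReciprocal(n: Int): Double =
    reciprocal.applyOrElse(n, (_: Int) => 0.0)
}
```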



Important point: the collect method takes a PartialFunction and internally checks isDefinedAt,
whereas the map method accepts any function.


  def     map[B](f: A => B)  :List[B]
  def collect[B](pf: PartialFunction[A, B])  :List[B]


Here are two similar examples:


listpers.map(p => pfCountryFilter(p))

scala.MatchError: Mark - Country(usa) -  mers 2013 (of class Person)

listpers.collect(pfCountryFilter)

res0: Seq[Person] = List(
John - Country(russia) -  mers 2013, 
Jenya - Country(russia) -  mers 2013, 
Alex - Country(russia) -  mers 2012, 
Jasper - Country(russia) -  mers 2013)


map raises a MatchError because it does not check isDefinedAt.
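
To illustrate the difference, here is a simplified sketch of what collect does conceptually. The helper collectSketch is hypothetical and is not the standard library implementation. It is shown next to the real collect and to a map call wrapped in Try.

```scala
import scala.util.Try

object MapVsCollectSketch {
  val evenOnly: PartialFunction[Int, Int] = { case n if n % 2 == 0 => n * 10 }

  // Hypothetical helper: keep only elements in the domain, then apply the function.
  def collectSketch[A, B](xs: List[A])(pf: PartialFunction[A, B]): List[B] =
    xs.flatMap(x => if (pf.isDefinedAt(x)) List(pf(x)) else Nil)

  val collected: List[Int] = List(1, 2, 3, 4).collect(evenOnly)
  val sketched: List[Int]  = collectSketch(List(1, 2, 3, 4))(evenOnly)

  // map applies the function to every element, so the first odd number fails.
  val mapped: Try[List[Int]] = Try(List(1, 2, 3, 4).map(evenOnly))
}
```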

