Posts

Posts from February 2019

Machine Learning with Spark (MultilayerPerceptronClassifier) and Scala

I will use the same kind of test data as in the previous article (KMeans). We will train a model, check it on the test data, and finally use it to classify (or cluster) points one by one. First, generate a test data set that contains 3 areas:

    drop table delit_test_data;
    create table delit_test_data as
    --cluster 1
    select 1 as cluster_number,
           ROUND(2+(case when dbms_random.value<0.5 then -1 else +1 end)*dbms_random.value*1.5,2) as x,
           ROUND(2+(case when dbms_random.value<0.5 then -1 else +1 end)*dbms_random.value*1.5,2) as y
    from dual connect by rownum<=30
    union all
    --cluster 2
    select 2 as cluster_number,
           ROUND(8+(case when dbms_random.value<0.5 then -1 else +1 end)*dbms_random.value*2,2) as x,
           ROUND(2+(case when dbms_random.value<0.5 then -1 else +1 end)*dbms_random.value*2,2) as y
    from dual connect by rownum<=40
    union all
    --cluster 3
    select 3 as cluster_number,
           ROUND(6+(case when dbms_random.value<0.5 then -1 else +1 end)*dbms_random.value*2
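The train / check / classify workflow described above might look roughly like the following in Spark ML. This is only a minimal sketch, not the article's code: the nine points are hypothetical hand-picked values (not rows from delit_test_data), the object and method names are made up for illustration, and the hidden layer size of 5 is an arbitrary assumption. One detail that matters: MultilayerPerceptronClassifier expects labels in the range 0..k-1, so cluster_number (1..3) is shifted down by one.

```scala
import org.apache.spark.ml.classification.MultilayerPerceptronClassifier
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object MlpSketch {
  // Trains on a tiny in-memory data set and returns the predicted label per row.
  def trainAndPredict(spark: SparkSession): Array[Double] = {
    import spark.implicits._

    // Hypothetical points (cluster_number, x, y); NOT the article's generated data.
    val raw = Seq(
      (1, 1.5, 2.2), (1, 2.4, 1.1), (1, 2.9, 2.7),
      (2, 7.1, 1.6), (2, 8.8, 2.9), (2, 9.2, 0.8),
      (3, 5.2, 7.4), (3, 6.6, 8.1), (3, 6.0, 6.3)
    ).toDF("cluster_number", "x", "y")

    // Labels must be 0..k-1, so shift cluster_number down by one.
    val labeled = raw.withColumn("label", (col("cluster_number") - 1).cast("double"))

    // Pack x and y into the single vector column that Spark ML estimators expect.
    val data = new VectorAssembler()
      .setInputCols(Array("x", "y"))
      .setOutputCol("features")
      .transform(labeled)

    // Layers: 2 inputs (x, y), 5 hidden units (arbitrary choice), 3 output classes.
    val mlp = new MultilayerPerceptronClassifier()
      .setLayers(Array(2, 5, 3))
      .setMaxIter(100)
      .setSeed(1234L)

    val model = mlp.fit(data)

    // transform() adds a "prediction" column with the predicted label.
    model.transform(data).select("prediction").as[Double].collect()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("mlp-sketch")
      .master("local[*]")
      .getOrCreate()
    try println(trainAndPredict(spark).mkString(", "))
    finally spark.stop()
  }
}
```

Here the model is checked on the same rows it was trained on, purely to keep the sketch short; a real check would use a held-out split, for example via randomSplit on the DataFrame.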

Machine Learning with Spark (KMeans examples) and Scala

This article is just my first step in Spark ML. You can read a short description on the wiki: K-means_clustering. First of all, we need to check everything.

1) Test data. Our data is a set of points in 2D; each point has 3 properties: an X coordinate, a Y coordinate, and a color. I am going to run 2 clusterings, without and with colors. I use Oracle SQL (because it is at hand) to generate the test data. I deliberately choose 3 cluster centers and generate random points around them:

    drop table delit_test_data;
    create table delit_test_data as
    --cluster 1
    select 1 as cluster_number,
           ROUND(2+(case when dbms_random.value<0.5 then -1 else +1 end)*dbms_random.value*1.5,2) as x,
           ROUND(2+(case when dbms_random.value<0.5 then -1 else +1 end)*dbms_random.value*1.5,2) as y
    from dual connect by rownum<=30
    union all
    --cluster 2
    select 2 as cluster_number,
           ROUND(8+(case when dbms_random.value<0.5 then -1 else +1 end)*dbms_random.value*2,2) as x,
           ROUND(2+(case when dbms_r
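For readers without an Oracle instance at hand, the same generation idea (center plus a random sign times a uniform random value times a spread, rounded to 2 decimals) can be sketched in plain Scala. This is an illustrative equivalent, not the article's code: the names Point, TestData, jitter, and generate are hypothetical, and only clusters 1 and 2 are included because the parameters of cluster 3 are cut off in the excerpt above.

```scala
import scala.util.Random

// One generated point: the cluster it was drawn for, plus its coordinates.
final case class Point(cluster: Int, x: Double, y: Double)

object TestData {
  // Round to 2 decimal places, like ROUND(..., 2) in the SQL.
  private def round2(v: Double): Double =
    BigDecimal(v).setScale(2, BigDecimal.RoundingMode.HALF_UP).toDouble

  // center + (random sign) * (uniform value in [0, 1)) * spread,
  // mirroring the expression used for x and y in the SQL.
  private def jitter(center: Double, spread: Double, rnd: Random): Double = {
    val sign = if (rnd.nextDouble() < 0.5) -1.0 else 1.0
    round2(center + sign * rnd.nextDouble() * spread)
  }

  // n random points around (cx, cy), each coordinate within +/- spread of the center.
  def cluster(id: Int, cx: Double, cy: Double, spread: Double, n: Int, rnd: Random): Seq[Point] =
    Seq.fill(n)(Point(id, jitter(cx, spread, rnd), jitter(cy, spread, rnd)))

  // Clusters 1 and 2 with the centers, spreads, and row counts from the SQL.
  def generate(seed: Long): Seq[Point] = {
    val rnd = new Random(seed)
    cluster(1, 2.0, 2.0, 1.5, 30, rnd) ++
      cluster(2, 8.0, 2.0, 2.0, 40, rnd)
  }
}
```

Calling TestData.generate(42L) returns 70 points; the fixed seed is an addition for reproducibility, which dbms_random does not give you by default.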