
Adding an application (angularjs+rest api) inside a WordPress site

If you need to integrate an application written with AngularJS and REST API services into your WordPress website, just create an empty page and edit it in “text” mode with something like

<!-- the following two lines can be put in the header template --> 
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.5.3/angular.min.js"></script>
<script src="http://your.site.com/app.js"></script>

<div ng-app="myApp" ng-controller="planetController">
       <div >
           <input ng-model="query" placeholder="inserisci una parola" type="text">
            <p><button ng-click="searchV(query)" >Dividi in sillabe</button></p>
       </div>
</div>

A running example is at http://rapid.tips/site/sillabazione-parole-italiane/ (for now; in the near future I’ll switch to a generated static web site).
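
The page above only contains the markup: the myApp module, the planetController and the searchV() function live in app.js and call a REST service. The snippet below is not the code behind the demo, just a minimal sketch of what the REST side could look like, using Flask and a hypothetical /api/syllables/<word> route with a naive placeholder for the hyphenation logic.

# hypothetical REST backend (a sketch, not the code behind the demo above)
import re

from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/syllables/<word>")
def syllables(word):
    # naive placeholder: cut the word after every vowel group
    parts = re.findall(r"[^aeiou]*[aeiou]+", word, flags=re.IGNORECASE) or [word]
    return jsonify({"word": word, "syllables": parts})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

With something like this in place, app.js only needs a controller whose searchV(query) calls the endpoint with $http and puts the response on the scope.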

Installing the Node.js oracledb module on SUSE SLES 11

For a quick tutorial about installing the Oracle module for Node.js (oracledb) on SUSE SLES, follow the instructions at

Node-OracleDB Installation

but remember to use release 5 of the gcc compiler (gcc-5)

export ORACLE_HOME=/home/oracle/instantclient_12_1
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME
export TNS_ADMIN=$ORACLE_HOME
export OCI_LIBRARY_PATH=$ORACLE_HOME
export OCI_LIB_DIR=$ORACLE_HOME
export OCI_INC_DIR=$ORACLE_HOME/sdk/include
CC=gcc-5 CXX=g++-5 npm install oracledb

ldiff2sql: How to import LDAP data into a database

Export your data

ldapsearch -o ldif-wrap=no -E pr=1000/noprompt -x -h ldapserver.redaelli.org -D "CN=admin,OU=users,DC=redaelli,DC=org" -w mypwd -s sub -b "DC=redaelli,DC=org" "(objectclass=computer)" dNSHostName description operatingSystem operatingSystemVersion  -LLL > ad-computer-sa.ldiff
rm hosts.csv

Convert to SQL
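
A small script is enough for this step. Below is a minimal Python sketch of the ldiff2sql conversion, assuming a hypothetical target table named hosts with one column per exported attribute; adapt the table and column names to your schema.

# minimal ldiff2sql sketch: turn the ldapsearch output into SQL INSERT statements
# assumes a hypothetical table hosts(dns_host_name, description, operating_system, operating_system_version)
attrs = ["dNSHostName", "description", "operatingSystem", "operatingSystemVersion"]

def entries(path):
    "yield one dict per LDIF entry (entries are separated by blank lines)"
    entry = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                if entry:
                    yield entry
                entry = {}
            elif ": " in line:
                key, value = line.split(": ", 1)
                entry[key] = value
    if entry:
        yield entry

def quote(value):
    return "'" + value.replace("'", "''") + "'"

with open("hosts.sql", "w", encoding="utf-8") as out:
    for e in entries("ad-computer-sa.ldiff"):
        values = ", ".join(quote(e.get(a, "")) for a in attrs)
        out.write("INSERT INTO hosts VALUES (%s);\n" % values)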

Deploying microservices in a Docker container

I already spoke about Docker containers (moving datacenter apps from virtual machines to containers)

This is a quick tutorial (my GitHub sample code) about a new way of deploying (micro)services and applications, i.e. using Docker containers: a sample Python webservice and a simple web page (HTML + AngularJS code)

Creating Docker containers means defining a Dockerfile like

FROM python:3.5
#FROM python:3-onbuild

ENV DEBIAN_FRONTEND noninteractive

ENV HTTP_PROXY="http://myproxy.redaelli.org:80"
ENV HTTPS_PROXY="http://myproxy.redaelli.org:80"
ENV http_proxy="http://myproxy.redaelli.org:80"
ENV https_proxy="http://myproxy.redaelli.org:80"
ENV PIP_OPTIONS="--proxy $HTTP_PROXY"

COPY requirements.txt /usr/src/app/
COPY app.py /usr/src/app/

WORKDIR /usr/src/app
RUN apt-get update && apt-get install -y nmap
RUN pip install --proxy $HTTP_PROXY --no-cache-dir -r requirements.txt

VOLUME ["/usr/src/app"]
EXPOSE 5000

ENTRYPOINT ["python"]
CMD ["./app.py"]

Put the additional Python packages you need in a file named requirements.txt

Flask
python-nmap
dnspython3

And create your application in the file app.py
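
As a minimal sketch (not the full code of my GitHub sample), app.py could look like the following: a Flask webservice listening on port 5000, with a hypothetical /scan/<target> endpoint built on the python-nmap package listed in requirements.txt.

# minimal app.py sketch: a Flask webservice exposing an nmap ping-scan endpoint
# (hypothetical endpoint, shown only to illustrate the container setup above)
from flask import Flask, jsonify
import nmap  # provided by the python-nmap package

app = Flask(__name__)

@app.route("/scan/<target>")
def scan(target):
    nm = nmap.PortScanner()
    nm.scan(hosts=target, arguments="-sn")   # ping scan only
    return jsonify({"target": target, "hosts": nm.all_hosts()})

if __name__ == "__main__":
    # listen on all interfaces so the published port 5000 is reachable
    app.run(host="0.0.0.0", port=5000)

Once the container is running, a quick test is curl http://localhost:5000/scan/127.0.0.1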

In this way we are going to build a Docker image with Python 3 and some additional Python packages, using the command

docker build -t python-infra-ws .

Finally we’ll start the container with the command

docker run -d -t --name python-infra-ws -p 5000:5000 python-infra-ws

Some other useful commands are:

docker stop python-infra-ws
docker start python-infra-ws
docker ps --filter name=python-infra-ws
docker rm python-infra-ws
docker rmi python-infra-ws

Analyzing huge sensor data in near realtime with Apache Spark Streaming

For this demo I downloaded and installed Apache Spark 1.5.1

Suppose you have a stream of data from several (industrial) machines like

MACHINE,TIMESTAMP,SIGNAL1,SIGNAL2,SIGNAL3,...
1,2015-01-01 11:00:01,1.0,1.1,1.2,1.3,..
2,2015-01-01 11:00:01,2.2,2.1,2.6,2.8,.
3,2015-01-01 11:00:01,1.1,1.2,1.3,1.3,.
1,2015-01-01 11:00:02,1.0,1.1,1.2,1.4,.
1,2015-01-01 11:00:02,1.3,1.2,3.2,3.3,..
...

Below is a system, written in Python, that reads data from a stream (use the command “nc -lk 9999” to send data to the stream) and, every 10 seconds, collects alerts from the signals: at least 4 suspicious values of the same signal from the same machine.

from pyspark import SparkContext
from pyspark.streaming import StreamingContext

min_occurs = 4

def signals_from_1_row_to_many(row):
  "output is (machine, date, signal_number, signal_value)"
  result = []
  for f in range(2,21):
    result = result + [(row[0], row[1], f-1, row[f])]
  return result

def isAlert(signal, value):
  defaults = [83.0, 57.0, 37.0, 57.0, 45.0, 19.0, -223.0, 20.50, 20.42, 20.48, 20.24, 20.22, 20.43, 20, 20.44, 20.39, 20.36, 20.25, 1675.0]
  soglia = 0.95  # "soglia" = threshold: flag values more than 95% away from the reference default
  if value == '':
     return True
  value = float(value)
  ref = defaults[signal -1]
  if value < ref - soglia*ref or value > ref + soglia*ref:
    return True
  else:
    return False
  
def isException(machine, signal):
  # sample data. the sensor 19 of machine 11 is broken
  exceptions = [(11,19)]
  return (int(machine), signal) in exceptions 

# Create a local StreamingContext with two working thread and batch interval of 10 second
sc = SparkContext("local[2]", "SignalsAlerts")
ssc = StreamingContext(sc, 10)

# Create a DStream that will connect to hostname:port, like localhost:9999
lines = ssc.socketTextStream("localhost", 9999)

all_alerts = lines.map(lambda l: l.split(",")) \
                 .flatMap(signals_from_1_row_to_many) \
                 .filter(lambda s: isAlert(s[2], s[3])) \
                 .filter(lambda s: not isException(s[0], s[2])) \
                 .map(lambda s: (s[0]+'-'+str(s[2]), [(s[1], s[3])])) \
                 .reduceByKey(lambda x, y: x + y) 

alerts = all_alerts.filter(lambda s: len(s[1]) >= min_occurs)

alerts.pprint()

ssc.start()             # Start the computation
ssc.awaitTermination()  # Wait for the computation to terminate
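
To try the job without real sensors, instead of typing rows into “nc -lk 9999” you can use a small test data generator like the following (a hypothetical helper, not part of the job): it listens on port 9999, waits for the Spark job to connect and streams random rows in the format shown above. Start it before submitting the Spark job.

# test data generator: listens on port 9999 (like "nc -lk 9999") and streams random sensor rows
import random
import socket
import time
from datetime import datetime

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("localhost", 9999))
server.listen(1)
print("waiting for the Spark job to connect on port 9999 ...")
conn, _ = server.accept()

while True:
    machine = random.randint(1, 3)
    now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    # 19 signal values, sometimes far from the defaults so that alerts are triggered
    signals = [round(random.choice([20.0, 200.0]) + random.random(), 2) for _ in range(19)]
    row = "%d,%s,%s\n" % (machine, now, ",".join(map(str, signals)))
    conn.sendall(row.encode("utf-8"))
    time.sleep(0.1)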

TwitterPopularTags.scala example of Apache Spark Streaming in a standalone project

This is an easy tutorial on using Apache Spark Streaming with the Scala language, taking the official TwitterPopularTags.scala example and putting it in a standalone sbt project.

 

In a few minutes you will be able to receive streams of tweets and manipulate them in real time with Apache Spark Streaming

  • Install Apache Spark (I used 1.5.1)
  • Install sbt
  • git clone https://github.com/matteoredaelli/TwitterPopularTags
  • cd TwitterPopularTags
  • cp twitter4j.properties.sample twitter4j.properties
  • edit twitter4j.properties
  • sbt package
  • spark-submit --master local --packages "org.apache.spark:spark-streaming-twitter_2.10:1.5.1" ./target/scala-2.10/twitterpopulartags_2.10-1.0.jar italy

How to collect Twitter data in 15 minutes

For this tutorial I assume you are using a Debian/Ubuntu Linux system, but it can easily be adapted to other operating systems

Install the software

apt-get install openjdk-7-jdk  
wget http://apache.panu.it/karaf/4.0.2/apache-karaf-4.0.2.tar.gz
tar xvfz apache-karaf-4.0.2.tar.gz

Start the server

cd apache-karaf-4.0.2/
./bin/start

Install additional connectors

ssh -p 8101 karaf@localhost
feature:repo-add camel 2.16.0
feature:install camel camel-blueprint camel-twitter camel-jackson camel-dropbox
exit

Configure our routes

Create two new files:

twitter-to-file.xml

<?xml version="1.0" encoding="UTF-8"?>
<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:camel="http://camel.apache.org/schema/blueprint"
       xsi:schemaLocation="
       http://www.osgi.org/xmlns/blueprint/v1.0.0 http://www.osgi.org/xmlns/blueprint/v1.0.0/blueprint.xsd
       http://camel.apache.org/schema/blueprint http://camel.apache.org/schema/blueprint/camel-blueprint.xsd">

  <camelContext id="twitter-to-file" streamCache="true" xmlns="http://camel.apache.org/schema/blueprint">

    <dataFormats>
      <json id="jack" library="Jackson" />
      <jaxb id="myJaxb" prettyPrint="true" contextPath="org.apache.camel.example"/>
    </dataFormats>

    <route id="twitter-tweets-to-file">
      <from uri="vm:twitter-tweets-to-file" />
      <setHeader headerName="CamelFileName">
         <simple>${in.header.twitter-id}</simple>
      </setHeader>
      <split>
        <simple>${body}</simple>
        <to uri="vm:twitter-tweet-to-file" />
      </split>
    </route>

    <route id="twitter-tweet-to-file">
      <from uri="vm:twitter-tweet-to-file" />
      <log message="Saving tweet id= ${body.id}" />
      <!-- transforming the body (a single tweet) to a json doc -->
      <marshal ref="jack" />
      <convertBodyTo type="java.lang.String" charset="UTF8" />
      <transform>
        <simple>${body}\n</simple>
      </transform>
      <setHeader headerName="CamelFileName">
        <simple>${in.header.CamelFileName}/${date:now:yyyy}/${date:now:MM}/${date:now:dd}</simple>
      </setHeader>
      <to uri="file:twitter-data?autoCreate=true&amp;fileExist=Append" />
    </route>
  </camelContext>
</blueprint>

twitter-streaming-sample.xml

<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">
  <camelContext id="twitter-search-sample" xmlns="http://camel.apache.org/schema/blueprint">
    <route id="twitter-search-sample">
      <from uri="twitter://streaming/sample?count=100&amp;type=polling&amp;consumerKey=XXX&amp;consumerSecret=XXX&amp;accessToken=XXX&amp;accessTokenSecret=XXX" />
      <setHeader headerName="twitter-id">
        <simple>sample</simple>
      </setHeader>
      <to uri="vm:twitter-tweets-to-file" />
    </route>

  </camelContext>
</blueprint>

and copy them into the “deploy” directory. Check the logs in data/log/karaf.log and see the results in the folder twitter-data/sample/yyyy/MM/dd
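
Since every file contains one JSON document per line, post-processing the collected tweets is easy. As a small example, the following Python sketch counts the most frequent hashtags in the collected sample (it assumes the twitter-data/sample layout produced by the routes above and a "text" field in each JSON line, which is how Jackson serializes the tweet body).

# small post-processing sketch: count hashtags in the collected tweets
import glob
import json
import re
from collections import Counter

counter = Counter()
for path in glob.glob("twitter-data/sample/*/*/*"):
    with open(path, encoding="utf-8") as f:
        for line in f:
            try:
                tweet = json.loads(line)
            except ValueError:
                continue  # skip partially written lines
            counter.update(re.findall(r"#\w+", tweet.get("text", "")))

for tag, count in counter.most_common(20):
    print(count, tag)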

 

Good luck!

Matteo

About Cayley, a scalable graph database

 

This is a fast tutorial on using the Cayley graph database (with MongoDB as a backend): Cayley is “not a Google project, but created and maintained by a Googler, with permission from and assignment to Google, under the Apache License, version 2.0”. Create a configuration file cayley.cfg like

{
"database": "mongo",
"db_path": "cayley.redaelli.org:27017",
"read_only": false,
"host": "0.0.0.0"
}
  • ./cayley init -config=cayley.cfg
  • ./cayley http -config=cayley.cfg -host="0.0.0.0" &
  • create a file demo.n3
"/user/matteo" "is_manager_of" "/user/ele" .
"/user/matteo" "has" "/workstation/wk0002" .
"/user/matteo" "lives_in" "/country/italy" .
  • upload data with: curl http://cayley.redaelli.org:64210/api/v1/write/file/nquad -F NQuadFile=@demo.n3
  • or: ./cayley load -config=cayley.cfg -quads=demo.n3
  • query data with: curl --data 'g.V("/user/matteo").Out(null,"predicate").All()' http://cayley.redaelli.org:64210/api/v1/query/gremlin
{
 "result": [
  {
   "id": "/workstation/wk0002",
   "predicate": "has"
  },
  {
   "id": "/country/italy",
   "predicate": "lives_in"
  },
  {
   "id": "/user/ele",
   "predicate": "is_manager_of"
  }
 ]
}
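
The same write and query calls can also be scripted. Here is a minimal Python sketch using the requests library against the HTTP endpoints shown in the curl examples above.

# minimal Cayley HTTP API sketch, mirroring the curl calls above
import requests

BASE = "http://cayley.redaelli.org:64210/api/v1"

# upload the demo.n3 quads
with open("demo.n3", "rb") as f:
    r = requests.post(BASE + "/write/file/nquad", files={"NQuadFile": f})
    print(r.status_code, r.text)

# run the same Gremlin query as the curl example
query = 'g.V("/user/matteo").Out(null,"predicate").All()'
r = requests.post(BASE + "/query/gremlin", data=query)
print(r.json())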