
Installing the Node.js oracledb module on SUSE SLES 11

Please follow the info in the official node-oracledb installation instructions,

but remember to use the gcc compiler release 5.0:

export ORACLE_HOME=/home/oracle/instantclient_12_1
export OCI_INC_DIR=$ORACLE_HOME/sdk/include
export OCI_LIB_DIR=$ORACLE_HOME   # with Instant Client the libraries live in the top directory
CC=gcc-5 CXX=g++-5 npm install oracledb

ldiff2sql: How to import LDAP data into a database

Export your data

ldapsearch -o ldif-wrap=no -E pr=1000/noprompt -x -h <ldap-host> -D "CN=admin,OU=users,DC=redaelli,DC=org" -w mypwd -s sub -b "DC=redaelli,DC=org" "(objectclass=computer)" dNSHostName description operatingSystem operatingSystemVersion -LLL > ad-computer-sa.ldiff
rm hosts.csv

Convert to SQL
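
The conversion boils down to parsing the LDIF entries and emitting INSERT statements. Below is a minimal Python sketch of the idea (the table and column names are illustrative, not part of the original ldiff2sql):

# ldiff2sql sketch: turn the LDIF export into INSERT statements for a "hosts" table
ATTRS = ["dNSHostName", "description", "operatingSystem", "operatingSystemVersion"]

def entries(lines):
    "yield one {attribute: value} dict per blank-line separated LDIF entry"
    entry = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            if entry:
                yield entry
            entry = {}
        elif ":" in line:
            key, _, value = line.partition(":")
            entry[key.strip()] = value.strip()
    if entry:
        yield entry

with open("ad-computer-sa.ldiff") as f:
    for e in entries(f):
        values = ", ".join("'%s'" % e.get(a, "").replace("'", "''") for a in ATTRS)
        print("INSERT INTO hosts (hostname, description, os, os_version) VALUES (%s);" % values)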

Deploying microservices in a Docker container

I have already talked about Docker containers (moving datacenter apps from virtual machines to containers)

This is a quick tutorial (my github sample code) about a new way of deploying (micro)services and applications, i.e. using Docker containers: a sample Python webservice and a simple web page (HTML + AngularJS code)

Creating Docker containers means defining a Dockerfile like:

FROM python:3.5
#FROM python:3-onbuild

ENV DEBIAN_FRONTEND noninteractive

ENV http_proxy=""
ENV https_proxy=""

COPY requirements.txt /usr/src/app/
# app.py is an assumed name for the application file (see below)
COPY app.py /usr/src/app/

WORKDIR /usr/src/app
RUN apt-get update && apt-get install -y nmap
RUN pip install --proxy $http_proxy --no-cache-dir -r requirements.txt

VOLUME ["/usr/src/app"]

ENTRYPOINT ["python"]
CMD ["./app.py"]

Put the additional Python packages you need in a file named requirements.txt
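
For example, assuming the sample web service is built with Flask (the actual list depends on your code):

flask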


And create your application in the main Python file (app.py in the Dockerfile above)
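
A minimal sketch of such an application, assuming Flask (see requirements.txt above); the real sample code is on github:

# app.py - a minimal Flask web service (sketch)
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/")
def index():
    # hypothetical endpoint: replace with your real service logic
    return jsonify({"status": "ok"})

if __name__ == "__main__":
    # listen on all interfaces so the port mapped by docker (5000) is reachable
    app.run(host="0.0.0.0", port=5000)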

In this way we are going to build a Docker image with Python 3 and some additional Python packages, using the command

docker build -t python-infra-ws .

Finally we’ll start the container with the command

docker run -d -t --name python-infra-ws -p 5000:5000 python-infra-ws
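
Once the container is up, a quick smoke test from Python (a sketch: it assumes the service answers on its root URL; adapt the path to your endpoints):

# hit the port published by "docker run ... -p 5000:5000"
import urllib.request

print(urllib.request.urlopen("http://localhost:5000/").read().decode("utf-8"))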

Some other useful commands are:

docker stop python-infra-ws            # stop the running container
docker start python-infra-ws           # start it again
docker ps -f name=python-infra-ws      # list the container
docker rm python-infra-ws              # remove the container
docker rmi python-infra-ws             # remove the image

Analyzing huge sensor data in near realtime with Apache Spark Streaming

For this demo I downloaded and installed Apache Spark 1.5.1

Suppose you have a stream of data from several (industrial) machines like

1,2015-01-01 11:00:01,1.0,1.1,1.2,1.3,..
2,2015-01-01 11:00:01,2.2,2.1,2.6,2.8,.
3,2015-01-01 11:00:01,1.1,1.2,1.3,1.3,.
1,2015-01-01 11:00:02,1.0,1.1,1.2,1.4,.
1,2015-01-01 11:00:02,1.3,1.2,3.2,3.3,..

Below is a system, written in Python, that reads data from a stream (use the command "nc -lk 9999" to send data to the stream) and every 10 seconds collects alerts from the signals: at least 4 suspicious values of a specific signal of the same machine

from pyspark import SparkContext
from pyspark.streaming import StreamingContext

min_occurs = 4

def signals_from_1_row_to_many(row):
  "output is (machine, date, signal_number, signal_value)"
  # columns 2..20 hold the 19 signal values; signal numbers start at 1
  return [(row[0], row[1], f - 1, row[f]) for f in range(2, 21)]

def isAlert(signal, value):
  defaults = [83.0, 57.0, 37.0, 57.0, 45.0, 19.0, -223.0, 20.50, 20.42, 20.48, 20.24, 20.22, 20.43, 20, 20.44, 20.39, 20.36, 20.25, 1675.0]
  soglia = 0.95  # "soglia" = threshold: tolerated relative deviation from the default
  if value == '':
     return True  # a missing reading is suspicious
  value = float(value)
  ref = defaults[signal - 1]
  if value < ref - soglia*ref or value > ref + soglia*ref:
    return True
  return False

def isException(machine, signal):
  # sample data. the sensor 19 of machine 11 is broken
  exceptions = [(11,19)]
  return (int(machine), signal) in exceptions 

# Create a local StreamingContext with two worker threads and a batch interval of 10 seconds
sc = SparkContext("local[2]", "SignalsAlerts")
ssc = StreamingContext(sc, 10)

# Create a DStream that will connect to hostname:port, like localhost:9999
lines = ssc.socketTextStream("localhost", 9999)

all_alerts = lines.map(lambda l: l.split(",")) \
                 .flatMap(signals_from_1_row_to_many) \
                 .filter(lambda s: isAlert(s[2], s[3])) \
                 .filter(lambda s: not isException(s[0], s[2])) \
                 .map(lambda s: (s[0]+'-'+str(s[2]), [(s[1], s[3])])) \
                 .reduceByKey(lambda x, y: x + y)

alerts = all_alerts.filter(lambda s: len(s[1]) >= min_occurs)
alerts.pprint()  # an output operation is required before starting the stream

ssc.start()             # Start the computation
ssc.awaitTermination()  # Wait for the computation to terminate
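
To try it without nc, here is a minimal Python stand-in for "nc -lk 9999" (a sketch: it listens on port 9999 and emits random rows in the format shown above):

# stand-in for "nc -lk 9999": serve random sensor rows to the Spark job
import random, socket, time

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("localhost", 9999))
srv.listen(1)
conn, _ = srv.accept()  # ssc.socketTextStream("localhost", 9999) connects here
while True:
    machine = random.randint(1, 3)
    signals = ",".join(str(round(random.uniform(0.0, 100.0), 2)) for _ in range(19))
    row = "%d,%s,%s\n" % (machine, time.strftime("%Y-%m-%d %H:%M:%S"), signals)
    conn.sendall(row.encode("utf-8"))
    time.sleep(0.1)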

TwitterPopularTags.scala example of Apache Spark Streaming in a standalone project

This is an easy tutorial on using Apache Spark Streaming with the Scala language, taking the official TwitterPopularTags.scala example and putting it in a standalone sbt project.


In a few minutes you will be able to receive streams of tweets and manipulate them in realtime with Apache Spark Streaming

  • Install Apache Spark (I used 1.5.1)
  • Install sbt
  • git clone
  • cd TwitterPopularTags
  • cp
  • edit
  • sbt package
  • spark-submit --master local --packages "org.apache.spark:spark-streaming-twitter_2.10:1.5.1" ./target/scala-2.10/twitterpopulartags_2.10-1.0.jar italy

How to collect Twitter data in 15 minutes

For this tutorial I assume you are using a Debian/Ubuntu Linux system, but it can be easily adapted to other operating systems

Install the software

apt-get install openjdk-7-jdk
# download apache-karaf-4.0.2.tar.gz from the Apache Karaf site, then:
tar xvfz apache-karaf-4.0.2.tar.gz

Start the server

cd apache-karaf-4.0.2/
bin/start

Install additional connectors

ssh -p 8101 karaf@localhost   # default password: karaf
feature:repo-add camel 2.16.0
feature:install camel camel-blueprint camel-twitter camel-jackson camel-dropbox

Configure our routes

Create two new files (any names will do, e.g. twitter-to-file.xml and twitter-source.xml):


<?xml version="1.0" encoding="UTF-8"?>
<!-- twitter-to-file.xml: routes that write the received tweets to disk -->
<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">

  <camelContext id="twitter-to-file" streamCache="true" xmlns="http://camel.apache.org/schema/blueprint">

    <dataFormats>
      <json id="jack" library="Jackson" />
      <jaxb id="myJaxb" prettyPrint="true" contextPath="org.apache.camel.example"/>
    </dataFormats>

    <route id="twitter-tweets-to-file">
      <from uri="vm:twitter-tweets-to-file" />
      <setHeader headerName="CamelFileName">
        <!-- reconstructed expression: one folder per day, as in the output layout below -->
        <simple>sample/${date:now:yyyy/MM/dd}</simple>
      </setHeader>
      <to uri="vm:twitter-tweet-to-file" />
    </route>

    <route id="twitter-tweet-to-file">
      <from uri="vm:twitter-tweet-to-file" />
      <log message="Saving tweet id= ${header.twitter-id}" />
      <!-- transforming the body (a single tweet) to a json doc -->
      <marshal ref="jack" />
      <convertBodyTo type="java.lang.String" charset="UTF8" />
      <setHeader headerName="CamelFileName">
        <!-- reconstructed expression: append the file name to the daily folder -->
        <simple>${header.CamelFileName}/tweets.json</simple>
      </setHeader>
      <to uri="file:twitter-data?autoCreate=true&amp;fileExist=Append" />
    </route>

  </camelContext>
</blueprint>

<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">
  <!-- twitter-source.xml: polls the Twitter streaming API (use your own keys) -->
  <camelContext id="twitter-search-sample" xmlns="http://camel.apache.org/schema/blueprint">
    <route id="twitter-search-sample">
      <from uri="twitter://streaming/sample?count=100&amp;type=polling&amp;consumerKey=XXX&amp;consumerSecret=XXX&amp;accessToken=XXX&amp;accessTokenSecret=XXX" />
      <setHeader headerName="twitter-id">
        <!-- reconstructed expression: the tweet id from the Twitter4j Status body -->
        <simple>${body.id}</simple>
      </setHeader>
      <to uri="vm:twitter-tweets-to-file" />
    </route>
  </camelContext>
</blueprint>

and copy them into the "deploy" directory. Check the logs in data/log/karaf.log and see the results in the folder twitter-data/sample/yyyy/mm/dd
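
To check what was collected, a small Python sketch (it assumes the JSON documents are appended back-to-back, as produced by the routes above):

# decode the concatenated JSON docs produced by marshal + fileExist=Append
import glob, json

decoder = json.JSONDecoder()
for path in sorted(glob.glob("twitter-data/sample/*/*/*/tweets.json")):
    text = open(path).read().strip()
    pos, count = 0, 0
    while pos < len(text):
        _, pos = decoder.raw_decode(text, pos)
        count += 1
    print(path, count, "tweets")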


Good luck!


About Cayley, a scalable graph database


This is a quick tutorial on using the Cayley graph database (with MongoDB as backend). Cayley is "not a Google project, but created and maintained by a Googler, with permission from and assignment to Google, under the Apache License, version 2.0".

"database": "mongo",
"db_path": "",
"read_only": false,
"host": ""
  • ./cayley init -config=cayley.cfg
  • ./cayley http -config=cayley.cfg -host="" &
  • create a file demo.n3
"/user/matteo" "is_manager_of" "/user/ele" .
"/user/matteo" "has" "/workstation/wk0002" .
"/user/matteo" "lives_in" "/country/italy" .
  • upload data with: curl -F NQuadFile=@demo.n3
  • or: ./cayley load --config=cayley.cfg -quads=demo.n3
  • query data with: curl --data 'g.V("/user/matteo").Out(null,"predicate").All()'
 "result": [
   "id": "/workstation/wk0002",
   "predicate": "has"
   "id": "/country/italy",
   "predicate": "lives_in"
   "id": "/user/ele",
   "predicate": "is_manager_of"