Mining on AWS

WARNING: THE METHOD DESCRIBED IN THIS ARTICLE IS ONLY AIMED AT PEOPLE WHO HAVE AWS CREDIT TO LOSE. DON’T USE IT WITH PROFITS IN MIND. Considering that cloud instances are usually expensive, and that the prices of cryptocurrencies (especially the ones that are still minable with CPU and GPU) have been collapsing lately, most of you must think I’m going mad. And, indeed, the goal of this walkthrough is absolutely NOT TO BE PROFITABLE.

An Introduction to Risk Analysis

Far from offering a full training course on ISO 27005, this short post introduces the basics to keep in mind before starting any new security project. Indeed, unlike other investments, security won’t bring new value to your company’s business; instead, it promises to protect your current value. As I discussed with students in a recent lecture I gave on the risks of IT outsourcing, when you sign a new outsourcing contract, the External Service Provider (ESP) has, where security is concerned, an obligation of means rather than of results.

Worldwide GPS tracks with OpenStreetMaps for urban design analysis

Dion Moult, 03/07/2018 | Source: thinkMoult

I work as an architect, and one of the datasets available to us during masterplanning and the early phases of an urban design project is GPS track activity. Knowing where people drive, walk, cycle, and recreate allows us to make decisions such as where to define architectural axes, where to place retail, and how to extend public transport and pedestrian walkways.

One of the resources available is from a company known as Strava, which runs a proprietary fitness social network where fitness buffs can track their movements via GPS devices (which can be as simple as your phone) and compare cycling routes, distances run, and so on. Primarily used by runners and cyclists, these GPS logs are voluntarily uploaded to Strava, which then aggregates all the data and resells it to urban design parties through its “Strava Metro” initiative.

Strava also hosts a public global GPS heatmap, available without purchasing any data, where you can see the fire of activity by runners and cyclists. Zooming in shows you right down to which streets people run along. As a high-level overview, this is a great graphic and can immediately pinpoint activity. It is also a fantastic feat of engineering, processing 5 terabytes of raw input data. That’s big data!

Strava heatmap example

Of course, just recently Strava decided to stop showing this public heatmap at high zoom levels and locked it behind a paywall. Thankfully, there are alternatives.

In a previous post, I introduced an open-data initiative known as OpenStreetMaps. Strava is largely based on OpenStreetMaps and uses it as a base layer embedded into MapBox, and also has a fork of the OSM iD editor called “Strava Slide”, which allows people to edit routes based on Strava GPS data tracks. However, OSM itself has many active GPS track contributors (the tracks are used for various purposes, such as mapping new routes and calibrating the map), and we can use this open data in lieu of the proprietary product offered by Strava. Below, we see the world’s GPS tracks from the perspective of OSM, visualised by Pascal Neis.

OSM GPX tracks

Before I get into the specifics of getting GPS data, I’d like to show you what data is in a GPS track. Here are some GPS tracks visualised with JOSM. We can see things like speed, direction, and sometimes elevation, if it is recorded. See those red segments of the line? Those are traffic lights!

GPS track velocity visualised with JOSM

OSM offers both an API and a Planet GPX extract for downloading GPS data. The Planet GPX extract is rather unwieldy, and is also very outdated (from 2013). The API is not the best either: it only returns 5,000 GPS points per query and doesn’t report the total number of result pages, so you can’t tell from a single query how many pages you will need to fetch. However, if you query the API with a page number higher than what is available, it won’t return any points. So, using a binary search, you can find out how many pages to extract.
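The page-count search described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the script used for the maps in this post: `count_pages`, `page_has_points` and `fake_probe` are hypothetical names, pages are assumed to be numbered from 0, and the probe is passed in as a function so the search logic can be tried without touching the network. Against the real API, the probe would request one page of trackpoints for your bounding box and report whether any points came back.

```python
def count_pages(page_has_points):
    """Return the number of non-empty pages.

    Assumes pages are numbered from 0 and that every page after the last
    non-empty one is empty. `page_has_points(page)` is a hypothetical
    callable that returns True if that page contains any GPS points.
    """
    if not page_has_points(0):
        return 0

    # Phase 1: double the page number until we land on an empty page.
    low, high = 0, 1  # invariant: page `low` is known to have points
    while page_has_points(high):
        low, high = high, high * 2

    # Phase 2: binary search between the last known non-empty page (low)
    # and the first known empty page (high).
    while high - low > 1:
        mid = (low + high) // 2
        if page_has_points(mid):
            low = mid
        else:
            high = mid

    # Pages 0..low are non-empty, so the count is low + 1, which equals high.
    return high


# Stand-in for the real API call: pretend the area has 750 pages.
def fake_probe(page):
    return page < 750


pages = count_pages(fake_probe)
print(pages, "pages, so at most", pages * 5000, "points")
```

The doubling phase means you don't need to guess an upper bound first, and the whole search takes a few dozen requests rather than hundreds.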

For the area of Sydney, there are roughly 750 pages of results, so that means just under 4 million GPS points. Here’s a heat map visualisation of it I made using QGIS (but JOSM also has a heat map visualisation feature). You won’t need huge supercomputers processing the data, either.

Sydney, Australia GPS activity heat map

Here’s another of Manhattan, New York.

Manhattan, New York, GPS activity heat map

We can couple this visualisation with other OSM data, such as all public transport nodes. In this case, railway tracks, bus routes, ferry routes, cycling routes, train stations, and bus stops are shown. This was also created with QGIS.

Sydney public transport GPS analysis

There are a few pros and cons to using this GPS data. The pro is that it’s more general purpose: it’s not only used by runners and cyclists, it’s used by regular people (well, GIS geeks) doing everyday things like shopping. Unfortunately, OSM isn’t that widely used, and so the data is relatively sparse. In remote areas perhaps no-one has walked that route, or only a few people have. So you don’t get a sense of what they’re doing. The GPS data is also not processed, so you’ll have to do your own cleaning: especially in the city where GPS data goes a bit haywire with all the tall buildings.
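As one illustration of the kind of cleaning meant here, the sketch below drops points that jump implausibly far from the previously kept point, which is a crude way to catch the stray readings that tall buildings cause. It is only a sketch: `haversine_m`, `drop_jumps`, the 200 metre threshold, and the sample coordinates are all assumptions made up for illustration, and real tracks will likely need more careful treatment.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def drop_jumps(track, max_jump_m=200.0):
    """Keep the first point, then keep each later point only if it lies
    within max_jump_m of the previously kept point."""
    cleaned = []
    for point in track:
        if not cleaned or haversine_m(cleaned[-1], point) <= max_jump_m:
            cleaned.append(point)
    return cleaned


# Made-up sample track with one stray reading in the middle.
track = [
    (-33.8700, 151.2070),
    (-33.8701, 151.2072),
    (-33.8650, 151.2200),  # sudden jump of over a kilometre
    (-33.8702, 151.2074),
]
print(drop_jumps(track))
```

Note that this approach trusts the first point of each track: if that one is bad, everything after it gets dropped, so it is worth eyeballing the result on a map.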

Have fun and happy mapping!

The post Worldwide GPS tracks with OpenStreetMaps for urban design analysis appeared first on thinkMoult.

Edmondson Park – a retail and residential development by Frasers and HDR

Dion Moult, 24/06/2018 | Source: thinkMoult

Three weeks ago, 30 kilometers away from the Sydney city centre in the rural suburb of Edmondson Park, Frasers Property Australia opened the doors of their display centre to the public. This brand new town centre development with residential and retail environments was designed by HDR Inc, where I am part of the team of architects. I haven’t really talked much about my architecture work before, but a brand new town centre in a previously uneventful part of Sydney is perhaps worth a blog article.

Perhaps let’s start with the blurb of the development which I’ve copied directly from the Frasers official Ed Square website:

From the makers of Central Park Sydney, Ed.Square brings inner city edge, but with so much more than you expected.

Ed.Square is a diverse urban neighbourhood of restaurants and cafes; shopping and entertainment; playgrounds and parklands; a market place and Eat Street; adjacent to the Edmondson Park train station and all within walking distance from your own front door.

Ed offers an array of residences crafted by some of the worlds best architects that cater for every lifestyle. Whether you are a first home buyer or a multi-generational family, you’ll always feel at home with Ed.

Sydney’s South West is one of the city’s growth areas, and the Edmondson Park development is one of those that will accommodate the population growth.

Despite working on the development, I probably don’t have any permission to use any marketing material, so you’ll have to visit the Frasers website to see all the pretty pictures and marketing.

However, I did take some snaps of the display suite, so here they are! Let’s start with the view you get as you enter. To the left are some display town houses. They are three-storey products which surround the town centre. With some imagination, you can make out the huge word “Ed.” written in bright yellow in front of them. Ed’s pretty hip, and is the personification of the neighbourhood. To the right, you can see a cafe by the display centre itself, which apparently serves some pretty tasty dishes that the local community loves, but it was closed when I arrived.

Edmondson Park display village centre

Here’s what you see as you enter…

Edmondson Park display suite entrance

And a snap of the physical model…

Edmondson park model

Let’s zoom in! The orange letter “T” is the train station, so you can see that the town centre is literally adjacent to it. The white buildings have yet to be released, so stay tuned.

Henderson Road model shot

Here’s another angle, showing the grand cinema facade.

Edmondson Park cinema

There’s a display apartment too …

Edmondson Park display apartment

… which shows some of the apartment, such as this fancy kitchen.

Kitchen in display apartment

Here’s another view from the balcony of one of the town houses looking at the display suite. In the background you can see a huge pit where construction will occur, surrounded by Cumberland Plain Woodland. I hear there could be koalas living there.


If you live in Sydney, feel free to drop by!

The post Edmondson Park – a retail and residential development by Frasers and HDR appeared first on thinkMoult.

OpenStreetMaps – an open-source Maps application

Dion Moult, 19/06/2018 | Source: thinkMoult

Recently I’ve been interested in an initiative known as OpenStreetMaps. Launched in 2004, OpenStreetMaps is the open-source equivalent of Google Maps, and functions much like Wikipedia does (and in fact was inspired by Wikipedia) – it’s a map of the world drawn completely by volunteers and open-source enthusiasts.

OpenStreetMaps world map

You might’ve already seen OSM in action. It’s used by default in the privacy-friendly search engine DuckDuckGo and in other wiki-based projects like WikiVoyage, and many games, such as Pokemon Go (shown below), use it as a base layer.

PokemonGo uses OpenStreetMaps as a base layer

You’ve probably used Google Maps before and have it installed on your phone to help you drive to places with the GPS. You may have also played with Bing Maps which essentially does the same thing. At first glance OpenStreetMaps is purely a clone: you can zoom in and out, look at street names and see buildings, and have it tell you how to drive to a destination. It’s not that exciting, and isn’t worth talking about.

However, if you are a user of OSM, you might occasionally notice areas of the map where volunteers have gone above and beyond to draw details of the environment that other maps will not: things like individual driveways, articulated building outlines, kerbside grass, wheelchair accessible walkways and kerb ramps, individual bush and tree locations, fences, and parking niches. Zooming in, we can identify storm drains, streetlamps, water taps and park benches. This level of detail is possible because the map is created by people who are genuinely interested and express a love and care in their work. The example below is in Brisbane, Australia, and was mapped largely by a fellow called ThorstenE.

OpenStreetMaps example in Brisbane, Australia

Where OSM really excels is as an open-data resource. With Google Maps and Bing, you are usually limited to the raster map images they produce and aren’t allowed to access the underlying database of geographic and vector information. In contrast, because the data in OSM’s database is free for everyone, specialist maps can easily be created. Take for example the extensive mapping of skiing and snowboarding tracks in Oslo, Norway provided by OpenSnowMap …

OpenSnowMaps

… alternatively there is the Whitewater rafting map in the UK …

Open Whitewater Rafting maps

… and the OpenCycleMap which maps the world’s bicycle routes, and shows the incredible culture of pedestrian and cycling friendly urban planning in the Netherlands.

OpenCycleMaps example

OSM also helps lead the way in humanitarian mapping. When a flood, fire, earthquake or other natural disaster occurs, existing maps provided by Google and Bing are no longer current. Mappers need to create new maps to allow disaster relief teams to coordinate their efforts, target houses for rescues efficiently, or know what routes relief organisations can take to navigate the terrain. This work is done by the excellent Humanitarian OpenStreetMap Team. Their work also covers non-natural disasters and issues, such as mapping demographics and environmental issues related to poverty elimination, gender equality, refugee response strategies, public disease outbreaks, clean energy, and water and sanitation. As one current example, right this minute the monsoon rains have caused severe flooding in the Kurunagala and Puttalam districts of Sri Lanka. A map is being prepared so that first responders and aid agencies can deliver relief supplies. A grid of zones with their mapping progress is updated in real time below.

Humanitarian OpenStreetMaps Team map in Nepal

As an open-source creation, it doesn’t data-mine your activity, so you can use it as a maps application without privacy concerns; you can download the raw vector data and use it offline on your phone; and it has a conservative approach to licensing data that lets people embed OSM technologies in their own creations in a much more flexible manner. If you feel strongly about supporting privacy-aware applications (especially after the Cambridge Analytica scandal), and encouraging communities that aren’t motivated by profit, OpenStreetMaps should be something to consider. There are over 1,000,000 mappers who have contributed to OSM, and you can become one of them too.

One of the most amazing things about OSM is that whereas mapping the world is an inherently complex process, it has managed to make it easy and fun and doable by anybody who knows how to draw a rectangle with their mouse. Most of the other open-source initiatives have a high learning curve and lots of technical prerequisites, but OSM is completely the opposite. Just zoom into your city on OSM.org and click the Edit button on the top left. It will give you a short tutorial that lets you draw new roads and buildings within minutes. The thought that has gone into the user-friendliness of this online map editor is absolutely incredible.

OpenStreetMaps iD web-based editing software

I’ll talk about OSM a bit more in upcoming posts, and share some of the more interesting technical sides of things.

The post OpenStreetMaps – an open-source Maps application appeared first on thinkMoult.

The three container security golden rules

As containers have become a standard in IT applications, enumerating a few security best practices is now a business need. I’ve therefore defined these three golden rules to keep in mind before pushing a new image for production to your company’s container repository. I. Careful with shared volumes you will be. Contrary to a virtual machine, a Docker container uses the host kernel directly, so in case of a kernel vulnerability, restricted permissions on shared resources won’t protect you from an attacker.

Secure your site with Lambda@Edge

Let’s now deal with a more enjoyable subject. My previous posts were essentially about sysadmin topics; I wrote them because I wasn’t able to find any satisfyingly complete reference online. However, you have probably seen in my description corner that I am Amazon Web Services certified, a DevOps adept, and strongly focused on security. As I have just obtained the AWS Security Specialty certification, it was high time I tackled an AWS security-oriented subject. I’m not going to describe how I built and deployed this website; I found all the information I needed here.

Ship your Applicative log files anywhere

As I recently had to manage an integration project for the Security Operation Center service of a big company, I had to configure applicative log forwarding to the nearest SIEM syslog collector for each service included in the scope. I found that the rsyslog agent is usually preinstalled on any Unix distribution, with the default operating system log folders configured out of the box, so that system log forwarding is most of the time almost as simple as service rsyslog start.

Deleting Facebook, and a reflection on digital privacy

Dion Moult, 01/04/2018 | Source: thinkMoult

In the wake of the recent Cambridge Analytica privacy issue in the news, I have decided to #DeleteFacebook. The thinkMoult blog is still represented via the public Facebook thinkMoult page, but my private profile has been cleared out. Given that Facebook is increasingly sharing our profile data (as shown in the graph below produced from Facebook’s very own reports), clearing out the account makes a difference, albeit a small one. I also thought it would be good to share a few things I’ve learned about Facebook in the past couple of weeks, related to my New Year’s resolution to improve digital security.

Facebook government requests over time

(Note: you can compare with Google’s data disclosure over time)

First, I’d like to commend Facebook’s behaviour so far. Being the world’s largest social network probably isn’t easy, and Facebook has made initiatives to increase its transparency. For instance, they issue a transparency report, and they use the Signal secure messaging protocol for a secure chat mode in FB Messenger. It is also possible to download your Facebook data, and place restrictions on data sharing with apps and advertisers. Their data retention policy also seems to suggest that if you delete data from your account, it’s also gone from their servers.

However, of course, this isn’t the complete picture. Take for instance the world map of Facebook government requests in the first half of 2017 from their very own transparency report.

Facebook government requests in 2017

The map (split into Jenks natural breaks) shows that US government requests are miles ahead of the rest of the world in asking Facebook for information. Most governments from other countries don’t play any part in this.

However, the map is incomplete. It is also not possible to see data shared through indirect means. Developers can easily create apps that integrate with Facebook. Whether you answer a survey through Facebook or use Facebook to log into another service, they can have varying degrees of access to your profile and friend information. This may also occur without your explicit consent. For instance, my meager Facebook usage has resulted in my details being shared with 138 companies. This is not to mention that Facebook trackers are on 25% of websites online. Oh, and let’s just forget Facebook altogether: Google trackers are on 75% of websites online (and yes, also on my blog). Basically, you are always tracked online, from the way you move your mouse to how you feel, which can be combined through machine learning to indirectly define character profiles, interests, and demographics.

Like most technologies, this data can be used for very positive things and very negative things alike. The negative side comes when services we assume are private social platforms are actually not. This data may be used to influence political elections, or help China rank all citizens, or rebrand political news as fake news in Malaysia, or even be accessed by any law enforcement agency around the world without notification or warrant – it doesn’t matter which – because people fail to understand that posting on Facebook is not a private matter: it is public.

Deleting Facebook is one step of many to promote the idea that just as there are public outlets for expression online (blogs, Twitter, Facebook) there equally are private outlets (Signal, Tor, ProtonMail). Of course, there is nothing inherently wrong with either outlet, but we should recognise these differences in privacy and know when to choose between them.

For more reading, see why digital rights matter, even if you don’t think they impact you, and how you can improve human rights by changing your messaging app.

The post Deleting Facebook, and a reflection on digital privacy appeared first on thinkMoult.

How to download the Australian BioNet Database

Dion Moult, 27/03/2018 | Source: thinkMoult

Did you know that there is a nest of endangered long-nosed bandicoots living just beside the popular Manly beach in Sydney, Australia? Well, I didn’t, until I looked at BioNet. The Australian NSW government created BioNet as a government database of all flora and fauna species sightings in NSW. It’s absolutely fantastic. If you’re an architect and want to see how you might impact the urban ecosystem in NSW, look at BioNet. If you’re an ecologist of some kind, you probably already use it. If you’re just a good citizen who wants to remodel your back yard to improve urban ecology, BioNet is there for you.

Fortunately, BioNet comes with an online search system called Atlas. It’s simple to use, but unfortunately it has limits on the data it produces. It won’t show you all the fields associated with species, won’t show meta fields, and has a limit to the quantity of records shown. Thankfully, BioNet comes with an API which can be queried with programming knowledge. I’ve written a bit of Python which will allow you to download regions of data; but before we get to that, let’s see a graphic!

Sydney BioNet species map

I’ve plotted every species on the database close to Sydney in the map above. Size is relative to the number of species sighted (logarithmic relationship). I haven’t done any real filtering beyond this, so it’s not very meaningful, but it shows the data and shows it can be geolocated. It also looks like someone murdered the country, but I’ll post the interesting visualisations in a future post.
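To give a feel for what a logarithmic size relationship does, here is a tiny sketch. The function name `marker_size` and the `base` and `scale` values are made up for illustration and are not the settings used for the map above.

```python
import math


def marker_size(sightings, base=2.0, scale=3.0):
    """Map a sighting count (1 or more) to a marker size that grows with
    the base-10 logarithm of the count. base and scale are arbitrary."""
    return base + scale * math.log10(sightings)


# Each tenfold increase in sightings adds the same amount to the size.
for n in (1, 10, 100, 1000):
    print(n, marker_size(n))
```

The point of the log scale is that a location with a thousand sightings doesn't drown out everything else on the map.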

The Python code works in two parts. The first queries the API for JSON results divided into square tiles covering the region between a top-left and a bottom-right latitude and longitude coordinate. This’ll give you a bunch of *.json files in the current working directory. Edit the coordinates and resolution as necessary, and off you go. I’ve put in a series of fields that should be good for more general uses, but you can check the BioNet Data API for all fields.

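# Download BioNet species sightings as JSON, one file per square tile.
# Requires the curl command-line tool to be installed.
import os

# Top-left and bottom-right corners of the region as (latitude, longitude).
start = (-33.408554, 150.326152)
end = (-34.207799, 151.408916)

lat = start[0]
lon = start[1]

# Build the OData query URL for the tile bounded by the given coordinates.
def create_url(lat, lon, lat_next, lon_next):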
    return 'https://data.bionet.nsw.gov.au/biosvcapp/odata/SpeciesSightings_CoreData?$select=kingdom,catalogNumber,basisOfRecord,dcterms_bibliographicCitation,dataGeneralizations,informationWithheld,dcterms_modified,dcterms_available,dcterms_rightsHolder,IBRASubregion,scientificName,vernacularName,countryConservation,stateConservation,protectedInNSW,sensitivityClass,eventDate,individualCount,observationType,status,coordinateUncertaintyInMeters,decimalLatitude,decimalLongitude,geodeticDatum&$filter=((decimalLongitude ge ' + str(lon) + ') and (decimalLongitude le ' + str(lon_next) + ')) and ((decimalLatitude le ' + str(lat) + ') and (decimalLatitude ge ' + str(lat_next) + '))'

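i = 0              # index of the current tile, used as the output file name
resolution = 0.05  # tile size in degrees

# Walk the region row by row: east along each row, then south to the next row.
while (lat > end[0]):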
    while (lon < end[1]):
        lat_next = round(lat - resolution, 6)
        lon_next = round(lon + resolution, 6)
        url = create_url(lat, lon, lat_next, lon_next).replace(' ', '%20').replace('\'', '%27')
        os.system('curl \'' + url + "\' -H 'Host: data.bionet.nsw.gov.au' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'Cookie: NSC_EBUB_CJPOFU_443_mcwjq=ffffffff8efb154f45525d5f4f58455e445a4a423660' -H 'DNT: 1' -H 'Connection: keep-alive' -H 'Upgrade-Insecure-Requests: 1' -H 'Cache-Control: max-age=0' > " + str(i) + '.json')
        i += 1

        lon = round(lon + resolution, 6)
    # Row finished: return to the western edge and step one tile south.
    lon = start[1]
    lat = round(lat - resolution, 6)
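If you want to check how many tiles (and therefore how many *.json files) a given region and resolution will produce before hitting the API, the same walk can be written as a generator. This is just a sketch of the loop above with the download step removed; `tiles` and `tile_list` are names made up here.

```python
def tiles(start, end, resolution):
    """Yield (lat, lon, lat_next, lon_next) for each square tile, walking
    east along a row and then stepping south, like the download loop."""
    lat = start[0]
    while lat > end[0]:
        lon = start[1]
        while lon < end[1]:
            yield (lat, lon,
                   round(lat - resolution, 6),
                   round(lon + resolution, 6))
            lon = round(lon + resolution, 6)
        lat = round(lat - resolution, 6)


start = (-33.408554, 150.326152)
end = (-34.207799, 151.408916)

tile_list = list(tiles(start, end, 0.05))
print(len(tile_list))  # 16 rows x 22 columns = 352 tiles
```

For the coordinates and resolution used in this post, that comes to 352 tiles.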

Now we’ll run another little script which will convert all the JSON files in the directory into a single CSV file. You can read this CSV file in programs like Excel or QGIS for further analysis.

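# Merge the downloaded JSON tiles into a single CSV file.
# unicodecsv is a third-party package (pip install unicodecsv).
import unicodecsv as csv
import json

f = csv.writer(open('bionet.csv', 'wb+'), encoding='utf-8')
# 16 rows x 22 columns of tiles for the region and resolution used above;
# update this if you change the coordinates or the resolution.
number_of_json_files = 352

# Header row; the data rows below must use the same column order.
f.writerow([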
    'IBRASubregion',
    'basisOfRecord',
    'catalogNumber',
    'coordinateUncertaintyInMeters',
    'countryConservation',
    'dataGeneralizations',
    'dcterms_available',
    'dcterms_bibliographicCitation',
    'dcterms_modified',
    'dcterms_rightsHolder',
    'decimalLatitude',
    'decimalLongitude',
    'eventDate',
    'geodeticDatum',
    'individualCount',
    'informationWithheld',
    'observationType',
    'protectedInNSW',
    'scientificName',
    'sensitivityClass',
    'stateConservation',
    'status',
    'kingdom',
    'vernacularName',
    ])
i = 0
while i < number_of_json_files:
    data = json.load(open(str(i) + '.json'))
    print(i)  # progress indicator
    # Each response lists its sighting records under the 'value' key.
    for x in data['value']:
        f.writerow([
            x['IBRASubregion'],
            x['basisOfRecord'],
            x['catalogNumber'],
            x['coordinateUncertaintyInMeters'],
            x['countryConservation'],
            x['dataGeneralizations'],
            x['dcterms_available'],
            x['dcterms_bibliographicCitation'],
            x['dcterms_modified'],
            x['dcterms_rightsHolder'],
            x['decimalLatitude'],
            x['decimalLongitude'],
            x['eventDate'],
            x['geodeticDatum'],
            x['individualCount'],
            x['informationWithheld'],
            x['observationType'],
            x['protectedInNSW'],
            x['scientificName'],
            x['sensitivityClass'],
            x['stateConservation'],
            x['status'],
            x['kingdom'],
            x['vernacularName'],
            ])
    i += 1
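If you'd rather not hard-code the number of files, or you are on Python 3 and want to stick to the standard library, the conversion could be sketched along these lines. This is an alternative sketch rather than the script I ran: `json_tiles_to_csv` is a made-up name, and it assumes each file holds its records under the `value` key as in the script above.

```python
import csv
import glob
import json


def json_tiles_to_csv(pattern, out_path, fields):
    """Write every record found under the 'value' key of each JSON file
    matching `pattern` into one CSV file. Returns the number of rows written."""
    rows = 0
    with open(out_path, 'w', newline='', encoding='utf-8') as out:
        # Fields missing from a record are written as blanks, and fields
        # not listed in `fields` are ignored.
        writer = csv.DictWriter(out, fieldnames=fields, extrasaction='ignore')
        writer.writeheader()
        # glob returns names in no particular order, so sort them
        # (lexicographically) to make the output reproducible.
        for path in sorted(glob.glob(pattern)):
            with open(path, encoding='utf-8') as f:
                data = json.load(f)
            for record in data.get('value', []):
                writer.writerow(record)
                rows += 1
    return rows
```

You would call it with something like `json_tiles_to_csv('*.json', 'bionet.csv', ['kingdom', 'scientificName', 'vernacularName'])`, passing whichever of the field names listed above you want as columns.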

That’s it! Have fun and don’t forget to check for frogs in your backyards. If you don’t have any, build a pond. Or at least a water bath for the birds.

The post How to download the Australian BioNet Database appeared first on thinkMoult.