DockerCon 2016 Summary

Brief summary of DockerCon 2016 announcements on security, monitoring and company updates:

Announcements:

Key announcements on:

  • AWS and Azure integration
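  • DABs (Distributed Application Bundles)
  • SwarmKit
  • Docker for Mac and Windows
  • Security: 1. Docker Trusted Registry (DTR) 2. Docker Security Scanning (DSS) 3. Docker Content Trust (DCT) / image signing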

Companies (from Sastry)

DataDog

  • Monitoring as a service: infrastructure and application
  • Intelligent alerting, insightful dashboards
  • Collect data from containers, cloud providers, data stores, other monitoring providers all in one place:
  • metrics and metadata (tagging and labels from docker infrastructure), host map
    • Most intensive container or # of web requests for this application

Dynatrace ruxit

  • Entire stack – hosts, nodes, processes, microservices
  • Discovers dependencies: which service connects to which other services
  • Machine learning, no need to configure thresholds etc.
  • JavaScript errors -> database errors

Sysdig

  • Can be deployed as a container (based on a component deployed in kernel)
  • Cluster, network, process, application level, Java JMX, response time, database queries
  • Aware of services, and understands the relationships and interactions of services
  • Kubernetes, Mesos, Docker Swarm, Amazon AWS
  • Deployment and logical topology

Aqua immersive security for containers

  • Jenkins plug-in for scanning image for vulnerabilities before image push
  • Encrypting environment variables to protect secrets
  • REST API for free security scanner, highlights suspicious container behavior

SumoLogic

  • Saas
  • collect data via http post, agent in a container
  • Log signatures with machine learning – outlier, anomaly detection

BLACKDUCK know your code

  • Visibility into open source in containers
  • Identify open source, and enforce open source use policies
  • Identify vulnerabilities 3 weeks before NVD

Twistlock security built for containers

  • Docker containers are declarative (immutable images)
  • What software should be running, what ports are open, container links
  • Runtime behavior – build models of runtime behavior and compare actual execution state against models
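
The last bullet (build a model, then compare actual execution state against it) can be illustrated with a small sketch. This is not Twistlock's implementation; the `ContainerModel` class, the `find_deviations` helper and the sample values are all hypothetical, and a real product would model far more than process names and ports.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContainerModel:
    """Expected behavior derived from an image (fields are illustrative only)."""
    processes: frozenset = field(default_factory=frozenset)
    ports: frozenset = field(default_factory=frozenset)


def find_deviations(model, running_processes, open_ports):
    """Return whatever is observed at runtime but absent from the model."""
    return {
        "unexpected_processes": sorted(set(running_processes) - model.processes),
        "unexpected_ports": sorted(set(open_ports) - model.ports),
    }


# Hypothetical model for a web container: one server process, one port.
web_model = ContainerModel(processes=frozenset({"nginx"}), ports=frozenset({80}))

# Hypothetical observation: an extra shell and an extra listening port.
report = find_deviations(web_model, ["nginx", "sh"], [80, 4444])
print(report)
```

Because images are declarative, the model can be fixed at build time, and anything that shows up later in `report` is a candidate for an alert.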

SignalFX

Data Management Solutions:

  • Hedvig software defined storage
  • crate.io scalable SQL database
  • ClusterHQ container data management
  • Couchbase
  • Robin Systems: application-aware compute and storage platform, container data persistence by controlling all layers

Network Solutions:

  • Weave networking and management for Docker and microservices
  • Arista software-defined networking
    container tracing -> which container is running on which node
  • PLUMgrid software-defined networking

Container Management:

  • CloudSoft container service
  • EMC container platform
  • VMware automation for containers
  • Microsoft
  • Cisco
  • Joyent Triton containers as a service
  • Google Cloud Platform
  • Rackspace Carina
  • Oracle
  • 1&1 managed cloud hosting
  • Rancher open-source container mgmt for Swarm, Kubernetes, Mesos
  • Apcera
  • Apprenda

 

 

 

BU Seminar: Seamless, Unified Operational Visibility and Analytics Designed for Cloud

I will be giving a talk on our recent work on cloud monitoring & operational analytics at Boston Univ. next week. The details are below. I am hoping to add what I learn from the day’s discussions later as well.

[Image: Photo Apr 28, 1 34 10 AM]

Seamless, Unified Operational Visibility and Analytics Designed for Cloud

Emerging cloud services allow users to define and provision complex, distributed systems with unprecedented simplicity and agility. With the push of a button entire stacks of software can be instantiated within minutes with various configurations and customizations. Automation, continuous integration and delivery further simplify the entire lifecycle management of modern born-on-the-cloud applications. These advances also bring in additional research challenges and opportunities for cloud. Operational visibility into the complex, distributed user applications, cloud runtimes and the underlying infrastructure is becoming a persistent pain point for both the end users and the service/platform providers. As the system and configuration complexity grows, data-driven operational analytics for security, compliance, configuration and resource management are becoming key areas of interest. In both cases, operational visibility and analytics, existing traditional solutions are either ineffective or insufficient. Their assumptions are based on a different era of computing that no longer applies, with long-running, dedicated systems that can tolerate ample configuration times and resource overheads, and where it is common practice to push the monitoring and analytics burden to the end user context.

In this talk, I propose a fundamentally different approach for designing seamless, unified and deep operational visibility and analytics services in the cloud. I first present Agentless System Crawler, a cloud-native framework, which leverages cloud, virtualization and containerization abstractions to provide complete visibility into running entities in the cloud, without modifying, instrumenting or accessing the end user context. Our approach with crawlers is based on introspection without intrusion, and as-a-service without necessitating guest cooperation. I demonstrate how we use VM introspection and container namespace mapping techniques to provide a “touchless”, “out-of-band” and “always-on” cloud operational visibility service that is built into the platform. Cloud users simply register for this service, without worrying about the plumbing, overheads or side effects of gaining visibility into their environments. Cloud operators can leverage this approach to provide deeper operational insights without interfering with the user environment, thus enabling a new set of cloud operational analytics as a service that are always on and available with the push of a button.

In the second part of this talk, I discuss how we leverage this seamless, deep visibility today for building cloud-native operational analytics services for security, compliance, system discovery and misconfiguration analysis. Last, I present some of the open problems and some of our upcoming research directions.

Links From the End of the Talk for Those Interested in Learning More or Contributing:

Papers:

  • Operational Visibility: IC2E’14, Sigmetrics’14, VEE’15, HotCloud’15, ATC’16 (InterConnect’15)
  • Operational Analytics: BigData’14, IBM JRD’16:{SWDisc,NFM,DevOps} (InterConnect’16)

Blogs:

Demos:

Open Source:

Try It:

Discussed Open Problems as Promised:

  • Truly Seamless OpVis: No performance impact (~/~) + Absolutely no side effects (+/-)
  • Extensibility and configurability: Deep visibility into system, application and infra
  • Scale out across runtimes and scale up to many instances; challenges & limits
  • How do you design DDOS-mitigation/admission-control/fair sharing in this model of built-in service?
  • Privacy and data sensitivity with Ops data analytics
  • Piecemeal analytics/security solutions –> Cloud analytics/security roadmap
  • Rules/annotators –> Actually smart analytics that learn
  • good and bad configs for security, performance, availability, etc.
  • Cross-silo analytics across Time, Space, Dev/Ops [CloudSight Dream]

Additional Points from Discussion:

  • Decoupling data for security and privacy concerns: Crawl system data and not user data. Give user the flexibility to choose what is exposed and what is not. We should find better ways of controlling what is visible to any monalytics system, whether agent-based or agentless.
  • Is the out-of-band approach really secure? Good point on never enough security, and centralized point of entry. VMs vs. containers vs. unikernels.
  • TBC

Docker DC ANNOUNCEMENT

On a somewhat unrelated note, Docker also released Docker DC for on-prem container svc.

Today we are excited to announce the availability of Docker Datacenter (DDC), an integrated, end-to-end platform for application development and management at any scale. The enterprise-ready solution includes: Docker Universal Control Plane, Docker Trusted Registry and embedded support for Docker Engine and Swarm.

Interconnect General Session 3

My livenotes of InterConnect General Session 3:

Amazing video story of Blind long distance runner Simon w runkeeper. IBM Cloud powers this… Amazing!!
Founder & CEO of Runkeeper. Jason. Asics just acquired them.
US is 1/3 of runkeeper population. Data mgmt, building analytics on data. They are working w ibm on analytics and personalization.

CTO of IBM CDS: Adam. prolly from cloudant as they were also boston.

Yes. they were cloudant. they were office neighbors. runkeeper and cloudant. runkeeper used cloudant for data mgmt.

They pulled 3M routes of users, and use that to find popular segments and as a rec engine. Awesome example. data is king. used graphdb svc for this.

They did sth in munich w dashdb.
from routes to watson analytics, twitter, personality insights,…
runkeeper ceo said they wanna be a “trusted advisor”. i think we found Jims soulmate. Emphasizing data, personalized svc.

Simon will go to namibia to run alone in desert. bmix garage will help him.

——-

Phaedra (global lead, serious gaming, gamification) & David Connover: serious gaming. David is a HS teacher. teaches “serious game design”.

The game for adults, learning, cognitive. ibm cloud helps. Watson comes and interacts w game. They use BlueMix to stand up minecraft and interact w Watson APIs.

Phaedra:

– over 55 play more games than under 18

– Games help cure ptsd.

– simcity powered by watson. use historical data that drive game events.

– partnering w DoEducation for EDSIM challenge.

——-

Watson experience from dev view:

Tanmay Bakshi 12yr old. He loves coding. he teaches programming on youtube. Started computing at K. and learned programming and last year learned watson and retrieve and rank svc. IBM Cloud Advisor grp found out about his work and they connected.
He also built asktanmay an “NLQA system”, so you can q&a and he searches web docs and it gives us results.


This was just a w e s o m e !

Watson is a great tool!
——-

VP marketing of IBM Cloud: Quincy Allen

1B users; 1.8B images shared every day

WhatsApp Deputy Prime Minister of Thruput: Rick Reed

57 engineers.

No ads, games, gimmicks

Very few meetings, quiet office, simple sw, minimal hw, JUE, JUC, focus on scalability, they use erlang, it is a dream to deploy and maintain.

They run on SL. Facebook also gave them world class infra. They make sure their stuff is horz scalable.

200 metrics/server/s

They started w dedicated bms, SL helped them customize their systems to their wkld need. Swapping RAM sticks for them at 3am. More a bms story.

Key advice:

Simplicity and think scalability, just the basics, early enough.

———

joshua carr bb-8 programming ex:

This i had read before when he published the blog, which was very cool. Use innova to do basic fmri and send to computer and then tie to bmix and then control bb-8. Very cool, i had checked while back, just needs good training.

Very good demo and plea.

——-

Play at Work: 

IBM Fellow IoT: John Cohn (I think from TJW)

Father in law ibmer and wife ibmer and he is ibmer 35yrs. Excellent motivating speaker.

dont suck at your kids high school presentation.

play at work is important.

1. keep that inner child

2. find some playmates


Boy and his atom 

samstones.org

burning man 20ft carousel

3. laws of play: put it out there. ask help.

4. be careful what you play w

5. do not worry so much

Bluemix, IoT

if it comes to apocalypse. go w cat food.

My career had only gotten better!

dont worry too much. find a way thru play….

This was inspirational as a researcher!

———

———

EcoSystem: Sandy Carter

ecosystem is more important than product. Innovate like startup.

Created BlueMix Garage method to act like a startup

dW arch center

IBM Watson AI XPrize 4.5M

only 1% devs embed cognitive today. 50% will in 2018.

New Announcement: bitly moves in to ibm cloud:

CEO Mark. All of bitly workload and links moves to IBM cloud.

12B clicks a month.

link shortening –> link mgmt platform for enterprise incl. ibm.

8B click/mo on phone

CMU SDI/ISTC Seminar: Agentless, Near-realtime VM Introspection in Origami

One of my earlier talks on agentless, introspection-based monitoring for VMs. It is interesting how long we have been pushing on this, and how recent containerization business made these ideas resonate more. I just copied from the SDI/ISTC Seminar Series for nostalgia:

DATE: Thursday, January 31, 2013
TIME: Noon – 1:00 pm
PLACE: Gates 8102
SPEAKER: Canturk Isci, IBM TJ Watson

TITLE: Agentless, Near-realtime VM Introspection in Origami

ABSTRACT:
Enterprise data centers continue to embrace virtualization and cloud computing technologies due to their dramatic benefits for simplifying and streamlining system provisioning and management, as well as to improve overall resource use efficiency of the underlying hardware infrastructure via virtual machine (VM) consolidation and dynamic, distributed resource management. As virtualization and cloud automation have made it cheap and simple to create, deploy and recycle VMs on a large scale, modern data centers have become much more dynamic environments than they were a few years ago. This is creating new challenges for routine maintenance operations like security and compliance scans, that continue to rely on a decades-old model based on in-VM monitoring agents and rule-driven analytics. When machines become transient and fungible resources, this model breaks down, resulting in increasing operational cost and risk.

In this talk we present a different approach to maintaining a large-scale dynamic data center. We show that out-of-VM monitoring agents can approach the data fidelity and real-time view of in-VM monitoring agents. Using this foundation we show how systems can be viewed as documents, enabling data mining and ad-hoc analytics techniques to be exploited for data center maintenance operations. These technologies are under development as part of the Origami system, an ambitious project to treat systems as data, and transform the way dynamic Cloud data centers are managed.

Notekeeping with WordPress

This was the first time i used WP for notekeeping in a conference. More recently i was flipping between Evernote, Notes and PlainText. I liked Evernote for being able to post anything, but the “upload everything after each edit” behavior was getting annoying as i was reaching my free quota fairly quickly. 

So i tried Notes and PlainText, but too many pics etc. make these unideal.  

Then i went to my latest love affair, Slack, and created a channel for the conf. That sounded like a good idea, but the messaging platform view is really rough for this stuff. What is worse is, when i want to modify txt or insert images, it is too difficult. So, obvious revelation: Slack is good for “append” and bad for “insert”:). No offense intended 😊

WP was just a trial. I did not expect it to be fast and easy to use, but it turns out to be a great way to do this thing. I took all my DockerCon notes live on WP on mobile and it was easy and fun. Hosting your own WP helps greatly, so i have control over who sees the content and how.

One more great benefit: previously i would convert my Evernotes to html and post to my web for sharing. Now, all i do is point to my blog and folks can access while i type even. I would say this is good soln. The biggest enabler is the great, reliable mobile app👍.

DockerConEU-ClosingSession

Could not pay sufficient attention to all:

1. Container Migration – Mantika

2. Unikernel in docker – Anil…


 unikernel.org

3. Minecraft by docker folks

Very cool. ctrl containers from minecraft. we really needed this feature.

DockerConEU-ContainerTorture

Jean-Tiare from OVH,

talks about introspection, how to run binary in container. need to get charts.

  

GREAT SIMPLE description of what a container is and how you become one.


how to enter a container

setns, execv

What about host binaries:

easy -> patch; hard -> auto code rewrite;

“ptrace”

Trace, mess w process, interact w process (like gdb)

what he does:

run setns and ptrace
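
To make the "how to enter a container" part concrete for myself, here is a minimal sketch of the setns + execv steps only (no ptrace). It is not the speaker's code. It assumes Linux with glibc and root privileges, and the helper names `namespace_paths` and `enter_and_exec` are made up for illustration.

```python
import ctypes
import os

# Namespace kinds joined here; mount goes last so the host /proc paths can
# still be opened. The PID namespace is left out on purpose: joining it only
# affects children, so it would need a fork before the exec.
NAMESPACES = ("ipc", "uts", "net", "mnt")


def namespace_paths(pid, kinds=NAMESPACES):
    """Map each namespace kind to the /proc handle of the target process."""
    return {kind: f"/proc/{pid}/ns/{kind}" for kind in kinds}


def enter_and_exec(pid, argv):
    """Join the namespaces of `pid`, then replace this process with `argv`.

    Needs root (CAP_SYS_ADMIN) on Linux; does not return on success.
    """
    libc = ctypes.CDLL("libc.so.6", use_errno=True)
    # Open every handle first: after joining the mount namespace the host
    # /proc may no longer be visible.
    fds = [os.open(path, os.O_RDONLY) for path in namespace_paths(pid).values()]
    for fd in fds:
        if libc.setns(fd, 0) != 0:  # 0 = accept any namespace type
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        os.close(fd)
    # From here on the binary path is resolved inside the container's filesystem.
    os.execv(argv[0], argv)
```

Usage would be something like `enter_and_exec(1234, ["/bin/sh"])`, where 1234 stands in for the PID of a process already running in the target container. Running a binary that only exists on the host is the harder case the talk goes on to cover.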


Very good talk on namespace jumping. Very similar to what we are already doing w crawler w static binary. So: good.

A lot more examples and demos. Get the video and slides here.

Code also on GitHub.

(Social Bookmarks + Social Share Counts) / Read It Later

In Research we have the chance to evaluate our projects in multiple dimensions including science, image, product, open source, and service impact among others. I have participated in this process in various roles over the past few years, and have learned a lot of interesting things, first as the outside reviewer of some of these projects and lately while preparing the evaluations of my team projects.

Science Impact: Last year I was looking into the science impact of some of our projects, and it seemed rather archaic to go through some of our papers, and add them title by title and citation by citation to a spreadsheet or doc, which has been kind of the common practice and would look something like this:

[Image: Screen Shot 2015-08-11 at 10.05.35 PM]

It was relatively easy and painless to create a dynamic and online version of this with Google Scholar. I simply created a pseudo GMail user, and then searched and collected relevant references from different contributors of the project. It is a somewhat non-standard use of Scholar, because unlike my personal Scholar page that has ALL citations that ALL include ME, this was a hand-picked subset of papers from multiple authors. So there was some customization involved. The end result of this looked much better, and most importantly it is dynamic as it continues to update itself:

[Image: Scholar Page for DCEM Project]

I did not have much to do with this page since its creation, and it is pretty great to see how it has been evolving (or devolving?) over time. There is still some hands-on work needed, to add new contributions and contributors manually, to remove duplicates, etc. So it is not perfectly hands off.

But it could be… Here are some potential features I would have loved to have, or if they exist, I am unaware of:

  • Define Groups for Authors: It would be great to define groups for people. I am sure faculty would love this feature to track their group publications. With this feature, we could have simply created a group and have each project member join the group from their scholar pages (maybe even have a time range, but that is kind of flaky, because of the lag in publication times).
  • Define Project Tags for Papers: It would also be awesome to have tags for papers, so we could match them with projects, groups, etc. With both options, we could have simply created a group+project profile by specifying which group and what tags, and the intersection would autofill.
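
The "intersection would autofill" idea from the two bullets above is simple enough to sketch. Everything here is hypothetical (Scholar offers no such groups or tags as far as I know): the paper records, the `project_profile` helper and the names are made up to show the matching rule only.

```python
# Hypothetical paper records: who wrote each paper and how it is tagged.
papers = [
    {"title": "Paper A", "authors": {"alice", "bob"}, "tags": {"opvis"}},
    {"title": "Paper B", "authors": {"carol"}, "tags": {"opvis"}},
    {"title": "Paper C", "authors": {"alice"}, "tags": {"storage"}},
]


def project_profile(papers, group, tags):
    """Titles with at least one author in the group AND at least one matching tag."""
    return [
        paper["title"]
        for paper in papers
        if paper["authors"] & group and paper["tags"] & tags
    ]


# A project profile is then just a group plus a set of tags.
profile = project_profile(papers, group={"alice", "bob"}, tags={"opvis"})
print(profile)
```

With this rule, a new paper shows up on the project page as soon as a group member publishes it with the right tag, with no manual adding.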

It does not escape me that this requires some curation, and i could always extend with the same manual steps I did above, but with some incentive I could see this working. I am also sure Google would be smart enough to auto-add/propose tags/groups for us.

Image/Social Impact: My main intention for this post was not to write about science impact and Scholar; that was an obvious, straightforward solution. At least the details above segue to what’s next. This year I was looking into three projects and their image/social impact. Just as above, I had no intention of making a static spreadsheet. And it seems fairly obvious that I was looking for something like a Scholar for web content, like blogs and press releases. I was hoping a read-it-later service like Pocket (one of my favorite tools) would take care of business here, or some social bookmarking would work. To my surprise, I could not find the right tool that fits the purpose here. I ended up using Google Sheets for this and some custom Apps Scripts, based on what I learned from these two folks:

I started with a spreadsheet from the first blog, and changed some queries based on the second one. With a few trials and errors I had a simple solution that looks like this:

[Image: Screen Shot 2015-08-11 at 10.46.09 PM]

You simply paste a URL and almost everything is autogenerated from there on, except for one thing: the title (well, in reality when & where as well, but those are not really critical for me). It had not occurred to me, but title/excerpt is one piece that is not trivial to extract, as page contents differ. Ideally, I think it would be very nice to have something like Pocket’s view, and at some point I might revisit this for such an experiment (enjoy the shameless plug on the selected 1st and 3rd example titles;)):

[Image: Screen Shot 2015-08-11 at 10.53.57 PM]

The above spreadsheet solved my problem, but it is far from effortless, and far from pretty. I am surprised that I cannot find a good solution that combines social share counting and read-it-later/social bookmarks. I know great add-ons like Social Share Counter and various web services exist for counting, but none seems to have the above collection, categorization and aggregation features.
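
On the missing title piece: a rough sketch of how it could be filled in, using only Python's standard `html.parser`. This is not what my spreadsheet does. The preference for an `og:title` meta tag over the plain `<title>` is my assumption about what gives cleaner titles, and the sample page is made up. Fetching the page is left out.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the <title> text and an og:title meta tag, if present."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title_parts = []
        self.og_title = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and attrs.get("property") == "og:title":
            self.og_title = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title_parts.append(data)


def extract_title(html):
    """Prefer og:title (usually without the site name), else fall back to <title>."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    title = "".join(parser.title_parts).strip()
    return parser.og_title or title or None


# Made-up page: the <title> carries the site name, og:title does not.
sample = """<html><head>
<title>Some Blog | A Post About Containers</title>
<meta property="og:title" content="A Post About Containers" />
</head><body></body></html>"""
print(extract_title(sample))
```

Pages with neither tag return `None`, which is exactly the case where a manual title is still needed.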

I would like to pose this as a feature request for Pocket: Please consider adding social share counts for bookmarks, and aggregations for tags/lists. Please also feel free to set me straight with any existing potential candidates for this.

 

Selenoid Blog

It turns out I somehow end up working on image management whether at work or at home. At work, VM image management, and at home, JPEG image management… I recently created another blog called Selenoid for the home part here:

[Image: SelenoidWelcomePage]

As it says on the landing page, the goal was to create a camera roll for our pictures, and it is only fair to call it “Selenoid” since 99% of our pictures feature Selen. The main motivation was somewhat more than just a place to put our pictures. What instigated this was a couple of activities with a bunch of other friends, and a whole bunch of 10MB emails flying back and forth with group pictures. Instead of emailing each other a bunch of pics, it seemed a good idea to have a single sink point where everyone can send or upload theirs, and can see other pics. All the prettying and gallerying is automatically done for you, and you can bulk download all. After playing with a few album tools, I ended up with this blog platform.

The requirements were pretty simple at first (later on, it turned out to be more, but we’ll come to that):

  • Simple to upload
  • Easy to download
  • Nice enough to view online
  • Commenting/captioning enabled
  • Private

For simple, since everyone was already emailing things around, it seemed straightforward to just use the same flow using post via email from Jetpack. Easy to download is fairly easy with the media library, the blog format is reasonably easy on the eye, and it is natural to include comments. For private, I used the “Private Only” plugin, which works nicely. Now anything besides the landing page brought up the login page:

[Image: SelenoidLogin]

Posting pictures required no access, as long as you had the special email address, but viewing needed login. It seemed all was good under the hood at this point, and experimentation started. After a few iterations two new requirements became obvious:

  • Categorization
  • Multitenancy

The first one is obvious, and fairly straightforward. The landing page describes the details of how to push categories during post-via-email. Multitenancy is a tough one and needs explaining. Once you start pulling in posts and categories from multiple events, you want to isolate which tenants see which content. There are a bunch of plugins and CRM-like frameworks that provide this. I have used “Role Scoper”, which is reasonably fine-grained and works well. It was, however, not fun to deal with setting roles for each individual user and thinking how this scales out. Setting roles also somehow started to break post via email, so eventually we ran out of playtime here. It is quite amusing that multitenancy gives me grief both at work and at home. Right now, we have roles scoped, but disabled by default. So I defer strong isolation for simplicity. Here is an example Roll from a few pushes via email. Reasonable end result (captured as WYSIWYG via FireShot) i think with minimal effort:

[Image: SelenoidPageEx]

This was a fun experience, and I learned quite a bit from the experimentation. Here is a list of all the packages I ended up using for the final version:

– Akismet
– Confirm User Registration
– Jetpack
– MOJO
– Private Only
– Private Only, Disable Feed
– Role Scoper [Disabled]

We don’t use this service much nowadays as shared iCloud Streams make it so easy, but still, not everyone has iOS devices, and there have been quite a few instances where email-to-Selenoid proved to be quite useful. I hope to revisit multitenancy at a later time and update this with a proper service with multi-user isolation.

My First Email Post and What I Learned about Posting via Email

This is my first post via email. Which originally said “Hello, this is my first email post..“. [Published on: Sep 28, 2014 @ 23:02]

 

Here is what I learned over time:

I am using JetPack’s Post via Email, which needs an actual WordPress account to connect to WordPress and to create a xxxxxxx@post.wordpress.com email. Once you generate this secret email, you can post by emailing to that address.
Here is the info: http://jetpack.me/support/post-by-email/

Next I wanted to add a custom email that I can use, so I used a forwarding service on top. Forwarding service works fine for text updates and small images (~100KB), but fails with bigger images. The secret WordPress email still works with bigger images (3-4MB)
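
A small sketch of how I could work around the forwarding limit when posting from a script: build the email with Python's standard `email.message.EmailMessage` and pick the recipient by attachment size. The ~100KB threshold comes from my observation above, not from any documented limit. The addresses and the `build_post` helper are hypothetical, and treating the subject as the post title is an assumption. Actually sending the message (e.g. via `smtplib`) is left out.

```python
from email.message import EmailMessage

# Rough limit from the observation above: forwarding coped with ~100KB images.
FORWARDING_LIMIT_BYTES = 100 * 1024


def build_post(subject, body, image_bytes, direct_addr, forward_addr):
    """Build a post-by-email message, bypassing the forwarder for big images."""
    small_enough = len(image_bytes) <= FORWARDING_LIMIT_BYTES
    msg = EmailMessage()
    msg["To"] = forward_addr if small_enough else direct_addr
    msg["Subject"] = subject  # assumed to become the post title
    msg.set_content(body)
    msg.add_attachment(
        image_bytes, maintype="image", subtype="jpeg", filename="photo.jpg"
    )
    return msg


# Hypothetical addresses; the real secret address is generated by the plugin.
DIRECT = "secret@post.example.com"
FORWARD = "blog@forwarder.example.com"

small = build_post("Small pic", "Hello", b"x" * 1024, DIRECT, FORWARD)
large = build_post("Big pic", "Hello", b"x" * (3 * 1024 * 1024), DIRECT, FORWARD)
print(small["To"], large["To"])
```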

Want to give Postie a shot sometime as well:
http://www.makeuseof.com/tag/email-blog-updates-wordpress-blog-postie/

Will update as I go