TechWatch
Monitoring innovations in hardware, software and vaporware
Curated by Nicolas Weil
Scooped by Nicolas Weil

Microsoft Switches to Open Source Servers, Steps Into the Future

Today, Microsoft will not only lift the veil from its secret server designs. It will “open source” these designs, sharing them with the world at large so that other online outfits can use them inside their own data centers.

Why Netflix's CDN should scare the storage industry

Lest storage vendors think they are immune to the disruption that open source hardware is causing in the server industry, Netflix’s new Open Connect content-delivery network might make them think again. While Open Connect directly targets commercial CDNs, it is based upon (or at least inspired by) open source designs first released by Backblaze almost three years ago. That Backblaze’s design has evolved and expanded its range into the data centers of a Fortune 1000 company is significant in the same way the evolution of modern man was for Neanderthals.

How We Built A Data Center With Commodity Hardware And FOSS

We are a startup named QuickoLabs, based in Bangalore, India. Our product, SearchEnabler, is on-demand SEO software that crawls and analyzes a user’s website and provides recommendations to help improve its ranking in search engine results.

Our goal is to make SEO easy, affordable and measurable for start-ups and small businesses. To realize that goal, we wanted to keep our operating costs to a minimum without compromising on product capability.

Today our infrastructure holds more than 8 TB of data collected from the web and processes nearly 250 GB of data every day. It stores more than 700 million unique URLs and has analyzed more than 35 million webpages. These numbers will grow quickly as our customer base increases.
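
To put those figures in perspective, here is a rough back-of-envelope sketch in Python. It uses only the numbers quoted above; the decimal units (1 TB = 10**12 bytes) and the function names are assumptions made for illustration, not anything taken from QuickoLabs' own tooling.

```python
# Back-of-envelope arithmetic on the figures quoted in the post.
# Assumption: decimal units (1 GB = 10**9 bytes, 1 TB = 10**12 bytes).
# The quoted figures are lower bounds ("more than", "nearly"), so the
# results are rough averages, not measurements.

SECONDS_PER_DAY = 24 * 60 * 60

TOTAL_STORED_BYTES = 8 * 10**12       # "more than 8 TB" collected from the web
DAILY_PROCESSED_BYTES = 250 * 10**9   # "nearly 250 GB" processed per day
UNIQUE_URLS = 700 * 10**6             # "more than 700 million unique URLs"


def sustained_mb_per_second(bytes_per_day: int) -> float:
    """Average throughput in MB/s if the daily volume were spread evenly over 24 hours."""
    return bytes_per_day / SECONDS_PER_DAY / 10**6


def average_bytes_per_url(total_bytes: int, url_count: int) -> float:
    """Average stored bytes per unique URL, treating all stored data as per-URL data."""
    return total_bytes / url_count


if __name__ == "__main__":
    throughput = sustained_mb_per_second(DAILY_PROCESSED_BYTES)
    per_url = average_bytes_per_url(TOTAL_STORED_BYTES, UNIQUE_URLS)
    print(f"Sustained processing rate: {throughput:.2f} MB/s")
    print(f"Average stored data per URL: {per_url / 1000:.1f} KB")
```

Under these assumptions the daily volume works out to roughly 2.9 MB/s sustained, and the stored data averages about 11 KB per unique URL. The per-URL figure is only indicative, since the 8 TB presumably includes more than raw URL records.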

 

Our infrastructure currently manages:

2 Application Servers
5 Cassandra Nodes
4 Task Trackers
9 Data Nodes

We have used the following open source software to set up 24×7 crawling, distributed storage and processing:

- Hadoop HDFS

- Cassandra NoSQL Storage

- Hadoop MapReduce Tasks

- Pig Scripts

- ZooKeeper for coordination

- Apache Nutch Search Engine

- Text Processing Through Lucene
