Archive for September, 2009

Fall of Mal-Ads: Microsoft to sue five unknown individuals

Wednesday, September 30th, 2009

Malicious ads have troubled the online advertising industry in the past, and they continue to do so. This time, Microsoft is taking action.

Microsoft Corp. has filed five lawsuits in Washington state court this Thursday against five unknown individuals operating under the business names Soft Solutions, DirectAd, qiweroqw.com, ITmeter INC. and ote2008.info. Microsoft is asking the court for damages and an order to stop these advertisers from operating.

What are Malicious Ads?

The term “malware,” derived from “malicious software,” refers to any software specifically designed to harm a computer or the software it’s running. Malware can be installed on a computer, with or without your knowledge, in a number of ways, usually when you visit a contaminated website or download seemingly innocent software. Malware ads are an attempt to use advertising to distribute malware by including malicious code inside the ad creative.

How do Malicious Ads behave?

It is very hard to predict the behaviour of a malicious ad. Most of the time the ad looks and feels normal, but sometimes it misbehaves by showing blank or inappropriate ads, which makes it easy to identify. Some known behaviours are as follows:

  1. The malicious ad does not load, or if it loads, it breaks the layout of the website.
  2. As soon as the malicious ad loads (for instance, an SWF/Flash-based ad creative), it crashes the browser.
  3. The malicious ad loads unapproved or unauthorized ads. This is hard for the user to recognize until a click redirects to the wrong website.
  4. The malicious ad, if clicked, leads to a spyware download or to more serious privacy-damaging issues.

What harm can it do?

Just like any other malware, malicious ads, if allowed to run, can harm the computer. The damage can range from a degraded user experience to harm to the whole software system.

General Tips

  • Be careful of what you download
  • Keep your system, browser, anti-virus and spyware prevention software updated regularly.
  • Keep up with the internet: subscribe to newsgroups or newsletters related to internet security.

What to do if you suspect Malware in your system?

  • Stop all activity connected with online financial activity, passwords and other sensitive information.
  • Update your antivirus and anti-spyware software so that the latest definitions are available and installed.
  • Once you confirm that your security software is up-to-date, run it to scan your computer for viruses and spyware.
  • If the problem persists after you diagnose and treat it locally, consult an expert.
  • Once your computer is back up and running normally, think about how malware could have been downloaded to your machine, and what you could do to avoid it in the future.
  • Finally, always monitor your computer for unusual behavior.

What to do if you suspect a certain website is displaying malware ads?

  • If you suspect that a website is displaying a malware ad, report it to the website contact or raise an alarm in the website forum or blog comments, for further investigation by the website team/webmaster.
  • Block the website from being accessed on your computer by adding it to the unsafe list till the threat is cleared.
  • Run antivirus and spyware checks and check for unusual changes.

More on this:

Recently, Eric Davis, the head of anti-malvertising at Google, speaking at the Virus Bulletin 09 conference, said that the company can’t do it alone when it comes to preventing malicious ads from finding their way onto search result pages and other websites. Instead, he suggested that an industry-wide coalition comprising ISPs and other concerned parties would have a major effect on the epidemic of malicious ads.

Malicious ads are turning out to be a problem for the industry as a whole. Is there a permanent solution? We will have to wait and watch. In the meantime, take a look at Google’s anti-malvertising.com website, which seems to be a first step against this epidemic.

Read more at MSNBC.

Datamining Conferences Updates

Wednesday, September 30th, 2009

Some upcoming data mining conferences:

1]  ICDM’09: The 9th IEEE International Conference on Data Mining

Dec 6-9, 2009, Miami, FL, USA.

The 2009 edition of the IEEE International Conference on Data Mining series (ICDM 2009) will be held in Miami, FL, USA, on December 6 through 9, 2009. The International Conference on Data Mining series (ICDM) is well established as a top ranked research conference in data mining, providing a premier forum for presentation of original research results, as well as exchange and dissemination of innovative, practical development experiences. The conference covers all aspects of data mining, including algorithms, software and systems, and applications. In addition, ICDM draws researchers and application developers from a wide range of data mining related areas such as statistics, machine learning, pattern recognition, databases and data warehousing, data visualization, knowledge-based systems, and high performance computing. By promoting novel, high quality research findings, and innovative solutions to challenging data mining problems, the conference seeks to continuously advance the state-of-the-art in data mining. Besides the technical program, the conference will feature workshops, tutorials, panels and, new for this year, the ICDM data mining contest.

2]  SIAM Conference on Data Mining

April 29 – May 1, 2010 Columbus, Ohio

Data mining is an important tool in science, engineering, industrial processes, healthcare, business, and medicine. The datasets in these fields are large, complex, and often noisy.  Extracting knowledge requires the use of sophisticated, high-performance and principled analysis techniques and algorithms, based on sound theoretical and statistical foundations. These techniques in turn require powerful visualization technologies; implementations that must be carefully tuned for performance; software systems that are usable by scientists, engineers, and physicians as well as researchers; and infrastructures that support them. This conference provides a venue for researchers who are addressing these problems to present their work in a peer-reviewed forum. It also provides an ideal setting for graduate students and others new to the field to learn about cutting-edge research by hearing outstanding invited speakers and attending tutorials (included with conference registration). A set of focused workshops are also held on the last day of the conference. The proceedings of the conference are published in archival form, and are also made available on the SIAM web site.

3]  The 27th International Conference on Machine Learning (ICML 2010)

21 – 24 June 2010, Haifa, Israel

The 27th International Conference on Machine Learning (ICML 2010) will be held in Haifa, Israel on June 21-24, 2010. ICML is the leading international machine learning conference, attracting annually some 500 participants from all over the world. ICML is supported by the International Machine Learning Society (IMLS).

4] 10th Industrial Conference on Data Mining ICDM-2010


July 12 – 14, 2010, Berlin/Germany

The Industrial Conference on Data Mining ICDM is held on a yearly basis. Researchers from all over the world will present theoretical and application-oriented topics on Data Mining. Practitioners can present and discuss their ongoing projects in Industry Sessions. Industrial Exhibition · Best-Paper-Award for Talks and Posters · Workshops: DM in Life Sciences DMLS, Case-Based Reasoning on Multimedia Data CBR-MD, and DM in Marketing DMM

5] KDD-2010

July 25-28 2010, Washington, D.C., USA.

The annual ACM SIGKDD conference is the premier international forum for data mining researchers and practitioners from academia, industry, and government to share their ideas, research results and experiences. KDD-2010 will feature keynote presentations, oral paper presentations, poster sessions, workshops, tutorials, panels, exhibits, demonstrations, and the KDD Cup competition. KDD-2010 will run from July 25-28 in Washington, DC and will feature hundreds of practitioners and academic data miners converging on one location.

More:

Abbr. | Name | Location | Submission | Acceptance | Camera-ready | Begins | Ends
KES2010 | 14th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems | Cardiff, UK | 2010-03-01 | 2010-04-01 | 2010-05-01 | 2010-09-08 | 2010-09-10
PAKDD 2010 | The 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining | Hyderabad, India | 2009-11-30 | 2010-01-30 | 2010-02-28 | 2010-06-21 | 2010-06-24
LIS 2010 | Learning in Intelligent Systems (Special Track at FLAIRS 23) | Daytona, Florida, USA | 2009-11-23 | 2010-01-22 | 2010-02-22 | 2010-05-19 | 2010-05-21
SDM 2010 | SIAM International Conference on Data Mining | Columbus, OH, USA | 2009-10-09 | 2009-12-15 | Unknown or ASAP | 2010-04-29 | 2010-05-01
AIA 2010 | The Tenth IASTED International Conference on Artificial Intelligence and Applications | Innsbruck, Austria | 2009-09-15 | 2009-11-01 | 2009-11-15 | 2010-02-15 | 2010-02-17
DS SAC 2010 | Data Streams Track (ACM SAC 2010) | Sierre, Switzerland | 2009-09-08 | 2009-10-19 | 2009-11-02 | 2010-03-21 | 2010-03-26
DM SAC 2010 | Special Track on Data Mining (ACM SAC 2010) | Sierre, Switzerland | 2009-09-08 | 2009-10-19 | 2009-11-02 | 2010-03-21 | 2010-03-26
AusDM’09 | Eighth Australasian Data Mining Conference | Melbourne, Australia | 2009-07-31 | 2009-09-11 | 2009-09-25 | 2009-12-01 | 2009-12-04
ICAART 2010 | 2nd International Conference on Agents and Artificial Intelligence | Valencia, Spain | 2009-07-28 | 2009-10-14 | 2009-10-28 | 2010-01-22 | 2010-01-2

Read more at kmining Conference Info.

Also check out their Graph edge crossing minimization game.

Structured Data using AlchemyAPI

Saturday, September 26th, 2009

Orchestr8, a developer of semantic tagging and text mining software, has announced a new technology, AlchemyAPI. The AlchemyAPI service is a web-based tool that performs automated tagging, categorization, semantic analysis, and text mining.

“Our AlchemyPoint mashup engine understands “structured” data residing within web pages and uses this information to expose a whole world of related online content. It provides advanced content manipulation and “mashup” capabilities that enable websites to be manipulated and combined with other Internet content. Pages can be visually remixed, content reformatted, shared, and more!” – www.orch8.net

Who is this Orchestr8?

Orchestr8 is based in Denver, Colorado, and has been in operation since 2005. The company describes itself as a leading provider of semantic tagging and text mining solutions.

Now complete with “visual constraints,” AlchemyAPI can also extract structured data from web pages using ‘natural language’ queries, Orchestr8 says.

Constituents of AlchemyAPI

1. Named Entity Extraction

2. Concept Tagging / Term Extraction

3. Text Categorization

4. Automatic language identification

5. Text Extraction/Web page cleaning.

6. Structured data extraction / Content Scraping

Now let's take a closer look at what these individual modules offer.

Named Entity Extraction:

This includes the following functions:

Identify people, companies, organizations, cities, geographic features, and other typed entities within HTML pages, text documents/content, and scanned document images.

This offers disambiguation that is not found in other similar solutions.

For more information on using AlchemyAPI for named entity extraction, click here.

Term Extraction:

This includes the following functions:

Extract important terms and “topic” keywords from HTML pages, text documents/content, and scanned document images.

Advanced statistical and linguistic algorithms are used for this purpose.

For more information on using AlchemyAPI for automatic keyword / term extraction, click here.

Automatic Language Identification:

This module determines the language in which web-based content was written. Orchestr8 describes AlchemyAPI’s language identification capability as the most robust in the industry today, supporting 97 different languages.

Content Scraping:

AlchemyAPI provides a structured data / content scraping capability, capable of extracting structured data (prices, product descriptions, etc.) from any web page. Employing advanced visual constraints, AlchemyAPI enables structured data to be extracted based on visual and structural traits, such as text labels, positioning, and more.
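
To make the idea of extracting values by their text labels more concrete, here is a minimal sketch in Python. It does not use AlchemyAPI and is not how AlchemyAPI works internally; it only uses the standard library's `html.parser`, and the class name `LabelValueExtractor`, the sample page and the labels are all made up for illustration. It simply treats the first piece of text after a known label as that label's value.

```python
from html.parser import HTMLParser


class LabelValueExtractor(HTMLParser):
    """Hypothetical helper: collect the text that follows each known label."""

    def __init__(self, labels):
        super().__init__()
        self.labels = set(labels)
        self.pending_label = None  # label seen, value not yet found
        self.fields = {}

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return  # skip whitespace between tags
        if text in self.labels:
            self.pending_label = text
        elif self.pending_label is not None:
            # The first non-empty text after a label is taken as its value.
            self.fields[self.pending_label.rstrip(":")] = text
            self.pending_label = None


# Made-up sample page with two labelled fields.
page = """
<table>
  <tr><td><b>Product:</b></td><td>Blue widget</td></tr>
  <tr><td><b>Price:</b></td><td>$19.99</td></tr>
</table>
"""

extractor = LabelValueExtractor(labels=["Product:", "Price:"])
extractor.feed(page)
extractor.close()
print(extractor.fields)
```

A real scraper would also need to cope with labels split across tags, missing values and repeated labels, which this sketch ignores.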

Visual constraints:

Visual constraints is the new content mining technology. This is a major leap over technologies such as XPath. This technology transforms any web page into structured information that may be interactively queried, scraped or even converted into Semantic Web content like RDF.
Read more about this at http://www.alchemyapi.com

The image of the AlchemyAPI layout can be found in the gallery for your reference.

-Vidhya, Student Intern

Oracle’s new warehouse: the second Exadata machine

Saturday, September 26th, 2009

Oracle chief Larry Ellison has unveiled the latest version of the company's Exadata data warehousing appliance.

The release had three primary goals:

  • Knock data warehousing rivals Netezza and Teradata;
  • Show that Oracle wasn’t going to let IBM punch Sun anymore;
  • And illustrate some of the logic behind the Sun acquisition.

What is so special about this Exadata warehouse?
Oracle Exadata Storage Servers combine smart storage software from Oracle and industry-standard hardware from Sun to deliver the industry’s highest database storage performance. To overcome the limitations of conventional storage, Oracle Exadata Storage Servers use a massively parallel architecture to dramatically increase data bandwidth between the database server and storage. In addition, smart storage software offloads data-intensive query processing from Oracle Database 11g servers and does the query processing closer to the data. The result is faster parallel data processing and less data movement through higher bandwidth connections. This massively parallel architecture also offers linear scalability and mission-critical reliability.

Notable benefits:

  • Extreme Performance for Data Warehouses
  • Extreme Performance for OLTP Applications
  • Extreme Performance for Mixed Workloads

How is this high performance a reality?

Now with the newest release of the Oracle Exadata Storage Server, you can also achieve extreme performance for transaction processing and consolidated mixed application workloads. Exadata Smart Flash Cache addresses the disk random I/O bottleneck problem by transparently moving hot data to Sun FlashFire cards. You get ten times faster I/O response time and use ten times fewer disks for business applications from Oracle and third-party providers.

The new memory hierarchy (processor caches -> DRAM -> Flash Cache -> disk), coupled with Oracle’s algorithms for using the Flash Cache layer effectively, brings a performance benefit to the solution, on top of all the other improvements that 12 months of hardware innovation bring: faster CPUs, more memory, and so on.
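
As a rough illustration of why an extra tier between DRAM and disk helps, here is a toy Python model of a tiered read path. It is only a sketch under simplifying assumptions: the flash tier is modelled as a plain LRU cache, processor caches are left out, and the names `FlashCache` and `read_block` are invented for this example. It is not Oracle's caching algorithm, which the post does not describe.

```python
from collections import OrderedDict


class FlashCache:
    """Toy LRU cache standing in for the flash tier (not Oracle's algorithm)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def get(self, block_id):
        if block_id not in self.blocks:
            return None
        self.blocks.move_to_end(block_id)  # mark as most recently used
        return self.blocks[block_id]

    def put(self, block_id, data):
        self.blocks[block_id] = data
        self.blocks.move_to_end(block_id)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block


def read_block(block_id, dram, flash, disk):
    """Return (data, tier) by checking DRAM, then flash, then disk."""
    if block_id in dram:
        return dram[block_id], "DRAM"
    data = flash.get(block_id)
    if data is not None:
        return data, "flash"
    data = disk[block_id]
    flash.put(block_id, data)  # promote so the next read avoids the disk
    return data, "disk"


# Made-up data: ten blocks on disk, one of them already in DRAM.
disk = {n: f"block-{n}" for n in range(10)}
dram = {0: "block-0"}
flash = FlashCache(capacity=2)

for block_id in (0, 5, 5, 6, 7, 5):
    data, tier = read_block(block_id, dram, flash, disk)
    print(block_id, tier)
```

In this model, the second read of a block is served from flash instead of disk as long as the block has not been evicted in the meantime.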

What does it actually have?

The Oracle Exadata Storage Server (Data Sheet, PDF):

  • 2U storage “unit” with either 1 TB SAS or 3.3 TB SATA redundant capacity. There is a query processor in the box that can “offload” tasks from the main database server, such as primary filtering, decompression, joins, and backups.
  • Storage units linked to database servers via dual Infiniband offering 20 Gbit/s (2.5 GBytes/sec) bandwidth.
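
The bandwidth figure in the last bullet is a plain unit conversion (8 bits per byte). A minimal Python check, with a helper name chosen only for this example:

```python
def gbit_to_gbyte_per_s(gbit_per_s):
    """Convert a link rate from gigabits per second to gigabytes per second."""
    return gbit_per_s / 8  # 8 bits per byte


print(gbit_to_gbyte_per_s(20))  # 20 Gbit/s expressed in GBytes/sec
```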

Some interesting aspects of the Oracle Exadata Storage Server:

Performance

The data sheet presents two options: 1 TB of SAS storage with 1000 MB/s bandwidth, or 3.3 TB of SATA storage with 750 MB/s. Compression comes on top of that: a typical data warehouse gets 2-3 times compression, so the effective bandwidth from a single Exadata server with SAS disks will be 2000-3000 MB/s.
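
The arithmetic behind that estimate can be sketched in a few lines of Python. The assumption here, which is a simplification, is that effective bandwidth is just the raw bandwidth multiplied by the compression ratio; the function and variable names are made up for the example.

```python
def effective_bandwidth_mb_s(raw_mb_s, compression_ratio):
    """Uncompressed data scanned per second, given raw bandwidth and compression."""
    return raw_mb_s * compression_ratio


# Raw bandwidth per storage server, as quoted from the data sheet above.
RAW_BANDWIDTH_MB_S = {"SAS": 1000, "SATA": 750}

for disk_type, raw in RAW_BANDWIDTH_MB_S.items():
    low = effective_bandwidth_mb_s(raw, 2)   # 2x compression
    high = effective_bandwidth_mb_s(raw, 3)  # 3x compression
    print(f"{disk_type}: {low:.0f}-{high:.0f} MB/s of uncompressed data")
```

For the SAS option this reproduces the 2000-3000 MB/s range; applying the same ratios to the SATA option gives 1500-2250 MB/s.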

Redundancy

Mirroring is provided by ASM (either 2- or 3-way). It is also performed across Exadata storage servers.

Disk failure does not abort queries or transactions.

An Exadata Storage Server failure does abort queries or transactions, but with no data loss.

Manageability

There’s a plug-in available for 10g Enterprise Manager, a GUI to manage all that.

For more:

Check this blog for further details.

-Vidhya, Student Intern
