The SAMAR Project by French Business Consortium 'Cap Digital'
TEMIS Text Analytics Technology Automates the Analysis of Arabic Language Information
NEW YORK, March 16, 2010 /PRNewswire/ -- TEMIS, the leader in Text Analytics solutions for the Enterprise, today announced it is playing an active role in the SAMAR Project, a government-funded multimedia content enrichment initiative of Cap Digital, the French consortium for digital content research and development.
Low volume of Arabic language content in North Africa
The information industry is still developing in North African countries, and the volume of content authored in Arabic is low. Newspapers play a key role in the development of the Arabic-language internet, accounting for 40% of its content. However, the production of content in Arabic faces steady demand, driven by the growing number of Arabic-speaking web users in these countries. Outside North Africa, press agencies are trying to increase their range of Arabic information sources.
Opening new horizons for Arabic content
The SAMAR project was initiated by Agence France-Presse (AFP), the Paris-based international news service. AFP plans to expand its online information portal to multilingual content, including Arabic. The structure of the Arabic language is extremely complex, and current technologies do not allow for optimal semantic tagging. Connecting Arabic content to information in other languages is also difficult.
Semantic analysis is necessary to index Arabic-language content and make it accessible and findable through online search.
SAMAR, the platform for Arabic multimedia information management
The SAMAR project team will apply new technology to AFP's wealth of Arabic-language output, around 1 million news articles totalling about 150 million words, as well as to a wide set of radio and TV multimedia channels.
The Arabic language challenge