FOLIO: mediaPRO

Magazine & eMedia Publishing Professional & Social Network

At Robin's suggestion, I have re-posted this discussion in this group. The original conversation can be found in the e-media group here.


I plan to implement an automatic or semi-automatic tagging and taxonomy assignment product in the near future, and would like to hear about other members' experiences with this type of software. I am especially interested in any potential "gotchas" or issues to keep an eye out for. Vendor recommendations would also be welcome; we are talking to Nstein, Temis, Inform and a few others at the moment.

A little background: we are a medium-sized B2B publisher with about 40 publications, and we generate a substantial amount of content on a daily basis, mostly articles, but also video, images, podcasts, etc. I am primarily concerned with tagging articles.

Tags: aggregation, content, tagging, taxonomy

Replies to This Discussion

There are about 70 vendors out there. I've attached a spreadsheet I put together for a U.S. government contract; it is just bare bones. The big divide among vendors is whether the software is linguistics-based and whether it requires "training sets" of documents. In my experience, some linguistic element is worth having, and training sets are a lot of trouble.

Thank you very much, this is very helpful!

David, how are you coming along with this project?

Robin Sherman
Editorial & Design Services

For Books, Journals, Magazines, Manuals, Newsletters, Internet

— Information Architecture, Content Development, Organization, Improvement
— Developmental, Substantive, Copy Editing
— Publication Critiques
— Publication Design
— Typesetting, Typography, Layout
— Speaker, Seminars, Workshops

Small Publisher and Non-Profit Specialist

Hi David,

I work at Nstein, so my 5 cents might be a little biased, and you might already have heard what I'm going to say. In any case, having this message in public could be helpful for someone else.

There are two types of text mining systems out there: hosted/perpetual license and SaaS. (Nstein and Temis are license-based; Inform is SaaS.)

The main difference is ownership of the metadata. While paying a smaller fee out of your operational budget might sound very seductive, in the end, when you go with SaaS you do not control your metadata, your taxonomy, your dictionaries, etc.

It's like maintaining your custom-made, heavily modded and beloved million-dollar car in a "Joe Schmoe" car shop. The point of text mining is to fine-tune access to your content on both your front end and your back end. When you don't control your taxonomy, entity lists and dictionaries, and you don't have the metadata in house, you are just touching the tip of the iceberg of possibilities, while paying more over a three-year SaaS subscription than you would pay to own a licensed and fully controlled text mining engine.

The second point I wanted to make is the importance of owning and dynamically controlling your taxonomy (or taxonomies). Who knows what's going to happen in your industry or area? Imagine being able to suddenly adjust and deepen the taxonomy for a newspaper in the Virginia area just when the shooting happened. How much more traffic could you retain on your site?
Or imagine having one general taxonomy for your overall content, and micro-taxonomies to serve special projects, micro-sites, etc. Or having a separate taxonomy for each of your syndication partners, thus providing them with enriched content (chewed and digested for them), which could greatly increase your syndication revenues.
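
For readers who want to picture the general-plus-micro-taxonomy idea, here is a minimal Python sketch. It is not how Nstein or any other vendor's engine works; it is plain dictionary matching, and every name in it (GENERAL, MICRO_SOLAR, tag_article, the tags and the trigger terms) is made up for illustration.

```python
import re

# Hypothetical taxonomies: each maps a tag to the terms that trigger it.
GENERAL = {
    "energy": ["pipeline", "refinery", "utility"],
    "finance": ["merger", "earnings", "ipo"],
}
# A hypothetical micro-taxonomy added for a special project or micro-site.
MICRO_SOLAR = {
    "solar-policy": ["solar tax credit", "net metering"],
}

def tag_article(text, *taxonomies):
    """Return the sorted tags whose trigger terms appear in the text."""
    lowered = text.lower()
    tags = set()
    for taxonomy in taxonomies:
        for tag, terms in taxonomy.items():
            # Whole-word (or whole-phrase) match, case-insensitive.
            if any(re.search(r"\b" + re.escape(term) + r"\b", lowered)
                   for term in terms):
                tags.add(tag)
    return sorted(tags)

article = "The utility backed net metering after its earnings call."
general_only = tag_article(article, GENERAL)
with_micro = tag_article(article, GENERAL, MICRO_SOLAR)
```

Passing an extra taxonomy per project or per syndication partner changes the tags without touching the general one, which is the kind of control described above.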

Thirdly, relevant content linking. When your text mining engine does not have live access to your DAM or other repositories, it cannot provide you with relevant content links between your regular content, your UGC, your partners' content, etc.
What's the point? One of the most successful features of text mining is all those hundreds of relevant links and inline tags that keep a user on your site for 30 minutes instead of 30 seconds.
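
As a rough illustration of relevant content linking, the sketch below ranks other items by how many tags they share with the current article, using Jaccard overlap. This is an assumed, simplified stand-in for what a commercial engine does; the function names (jaccard, related) and the sample tag data are hypothetical.

```python
def jaccard(a, b):
    """Share of tags two items have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def related(article_id, tags_by_id, limit=3):
    """Return up to `limit` other item ids, most tag overlap first."""
    target = tags_by_id[article_id]
    scored = [
        (jaccard(target, tags), other)
        for other, tags in tags_by_id.items()
        if other != article_id
    ]
    # Drop items with no shared tags; break ties by id for stable output.
    scored = [(score, other) for score, other in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:limit]]

# Hypothetical tags for articles, UGC and partner content in one repository.
tags_by_id = {
    "a": ["energy", "finance"],
    "b": ["energy"],
    "c": ["finance", "energy", "policy"],
    "d": ["sports"],
}
links = related("a", tags_by_id)
```

The engine can only score items whose tags it can see, which is why live access to every repository matters for this feature.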

Below I'm attaching a visual example of all the features that a properly installed and configured text mining engine can bring to your site, to your revenues, and to your bottom and top lines in a completely automated fashion.

© 2009   Created by FOLIO MediaPRO Team
