Eric Pugh's complete blog can be found at: http://www.opensourceconnections.com/author/epugh/

Friday, March 2, 2012

After a couple of exhausting days at HIMSS last week, where I was overwhelmed both by being a first-time visitor to the glitz of Las Vegas and by the sheer size of the HIMSS exhibit hall, I’ve come to the conclusion that HealthIT is currently running on two separate parallel tracks.

The first track is the effort to integrate data. I first became aware of the mishmash of XML and EDI based standards back in 2008 when popHealth was launched. I remember seeing a printout of an HL7-formatted message where the machine-readable XML-encoded data was printed in green, and the older proprietary formats, wrapped in some XML tags, were printed in red. Most of the document was red; very little was green, meaning very little of the data was interoperable between multiple EHR systems.

Fast forward to the 2012 HIMSS conference and it turns out we as an industry are still trying to solve this problem. I met a number of vendors such as PilotFish and Meddius who offered products for transferring documents between flavors of HL7 and into and out of proprietary formats. The sessions by the Veterans Administration and Department of Defense on merging their health records into an “Integrated Electronic Health Record” or iEHR were some of the most interesting, and demonstrated both the challenges of integrating EHRs and the benefits of having a single holistic view of a patient, regardless of where their health records exist. GovHealthIT has a great background article on this effort.

However, I feel like, despite all the money, and the vast improvement to patient outcomes that will come from integrating various EHRs, we are still just ensuring that if the patient needs a leg amputated, we know which leg is the right one! All this massive IT “plumbing” is just introducing efficiencies in shuffling data around and therefore reducing errors. The second track that I saw at HIMSS is the one where we actually try to change patient outcomes. Through technology, we ensure that the patient doesn’t even need to have a leg amputated because we know which treatments they will respond best to. Companies like Apixio are assuming that this IT plumbing has been solved, and that the real value in changing patient outcomes is looking at all the data available about a patient, both structured and unstructured, and coming up with better treatment options. If you are looking at one type of treatment, can you compare the efficacy against the larger population of similar folks to validate that it is the right one?

This is the really exciting area to work in: using the principles of big data, cloud computing, and data analytics to make sense of the massive amounts of uncoded data available to clinicians so they can make better decisions. Hopefully at HIMSS 2013 we’ll see more sessions on “How we improved patient outcomes via data analytics” and fewer on “how we tried to merge disparate data together”. I think the vast increase in information has driven this major change in medicine.

As technologies such as entity extraction, natural language processing, and other text mining tools progress, we may end up seeing the tension between those who want all the data to be highly structured, and those who say “give me your unstructured data, I’ll figure out what the structure is” disappear. Images become OCR text. Raw text becomes entity extracted structured data. And patient outcomes improve.


Wednesday, April 27, 2011

I recently had an IRC conversation about Solr 4.0. The main question from the person I was chatting with was “How far out is the 4.0 release?” The answer, as with almost any open source project, is “when it’s released.”
Naturally, that answer doesn’t really help get to the crux of what most IT teams who either use or are considering Solr need to figure out, which is whether 4.0 is stable enough to deploy in a live environment.
Solr, even in unreleased versions, has historically been pretty stable. So, if a new version, in this case 4.0, has the functions that you’re looking for (in this conversation, it was function queries like idf() or termfreq()), then unless you’re comfortable compiling a previous version of Solr and writing your own code on top of it, you’re probably going to want to go with the latest version.
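
As a rough illustration of the kind of function queries mentioned above, here is a minimal sketch that builds a Solr select URL returning termfreq() and idf() values alongside each document. The host, core name (collection1), field name (text), and search term are all assumptions made up for the example, and it assumes a Solr 4.0 build that accepts function queries in the fl parameter.

```python
from urllib.parse import urlencode


def function_query_params(field, term, rows=10):
    """Build select parameters that ask Solr for termfreq() and idf() values.

    The field and term are supplied by the caller; nothing here is specific
    to any real schema.
    """
    return {
        "q": f"{field}:{term}",
        # Function queries in fl come back as extra pseudo-fields per document.
        "fl": f"id,termfreq({field},'{term}'),idf({field},'{term}')",
        "rows": rows,
        "wt": "json",
    }


def build_select_url(base_url, field, term, rows=10):
    """Return the full select URL for a (hypothetical) Solr core."""
    return f"{base_url}/select?{urlencode(function_query_params(field, term, rows))}"


# Hypothetical local core; adjust to your own installation.
url = build_select_url("http://localhost:8983/solr/collection1", "text", "solr")
print(url)
```

Fetching that URL against a running 4.0 instance should return each matching document's id plus the two computed values, which is a quick way to check whether the build you downloaded has the functions you need.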
Of course, this approach does come with risk. I have only heard of one actual “bug” that led to incorrect results sneaking into the unreleased Solr code base, and it was quickly found and fixed. But you’re working on a code base that may change somewhat. If you are building indexes that you cannot easily rebuild (for example, you are indexing the Internet and can’t recrawl to regenerate the data, meaning Solr is your “system of record”), then be aware that the index file format may change over time because Lucene is changing under the covers, and periodically there is an email telling you that you need to rebuild your indexes. But if you are basically taking a download of Solr 4.0 as it is today, and only going to update a) when a new killer feature is added or b) when 4.0 comes out, then reindexing shouldn’t be a problem.
The other aspect of deploying Solr 4.0 is your testing environment. If you have strong system and functional testing, then you can be fairly sure that things are working appropriately. If you’re not certain about testing, check out my presentation on Better Search Engine Testing from this year’s Software Test and Performance Conference.
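
To make the idea of functional testing a little more concrete, here is a minimal sketch of a check that a known document shows up near the top of the results. It works on an already-parsed Solr JSON response, so it runs without a server; the helper names and the sample document ids are invented for illustration and are not taken from the presentation.

```python
def top_ids(response, n=3):
    """Return the ids of the first n documents in a parsed Solr JSON response."""
    docs = response["response"]["docs"]
    return [doc["id"] for doc in docs[:n]]


def assert_in_top(response, expected_id, n=3):
    """Fail loudly if expected_id is not among the first n results."""
    ids = top_ids(response, n)
    assert expected_id in ids, f"{expected_id} not in top {n}: {ids}"


# A canned response standing in for what a real query would return.
sample = {
    "response": {
        "numFound": 3,
        "start": 0,
        "docs": [{"id": "doc-7"}, {"id": "doc-2"}, {"id": "doc-9"}],
    }
}

assert_in_top(sample, "doc-2", n=2)
```

Running a handful of checks like this before and after swapping in a newer Solr build gives an early warning if ranking behavior changes underneath you.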

Friday, March 4, 2011

I’m very excited to be returning to another edition of STPCon, this year in Nashville!

This year I will be unveiling a new talk: Better Search Engine Testing, as well as a reprise of Turbocharge your Automated Tests with CI from STPCon 2009.

Since this is the first time for my Better Search Engine Testing talk, I’d love to connect with other folks in the testing community who’ve worked with search engines and get some feedback on my talk. Just hit me up at http://twitter.com/dep4b!

Check the STPCon website for more details on this great conference: www.stpcon.com


Wednesday, February 23, 2011

We’ve been talking to our customers to understand why they are adopting Solr. Is it because it’s a great product from the Apache Software Foundation? Is it because Lucene has a cute naming story? Is it because of Solr’s compelling feature set, ease of use, and great admin interface?

Unfortunately, it’s none of those reasons. The top three reasons for adopting Solr are:

  • Vendor Fatigue. Companies are tired of being tied up in complex and expensive licensing contracts, per-document pricing schemes, and opaque service agreements. This Dilbert comic I saw in the paper today struck very close to home:
    Dilbert.com
    This is why our basic services agreement is a one-pager!
  • Flexibility. We’ve worked with a number of clients who’ve used Google Search Appliance and Google Custom Search because they were so gosh darn simple to set up. They subsequently moved to Solr because they ran into the limitations of those solutions, either for indexing information or for presenting a rich search user experience. With Solr, you can integrate with any datasource, and display your search results in whatever format you need. The more “seamless” you want your search experience to be with your site, the better Solr looks.
  • Stability. As much as I love Lucene, it’s a very complex beast. Many organizations have developed their own search engines on top of Lucene that have become expensive to manage and don’t offer the features that Solr does. We’re going to be busy for a long time helping companies move from pure Lucene-based search to Solr!

Thursday, September 30, 2010

My article on the role of Continuous Integration as a bridge between the Developer and the Automated Test Writer, who continually engage in a Tug of War, was published in the September issue of Automated Software Testing Magazine! It’s been a couple of years since I wrote an article for a magazine, and this was my first time writing for a real honest-to-god paper magazine. It was a great experience, and I’m hoping it helps developers and testers start a dialogue about how to work smoothly together.