Reading TCX in Haskell
I use a Garmin GPS/heart-rate monitor watch to track my running. I also upload this GPS data to a service called RunKeeper to keep a history of my runs. While I’ve generally been happy with RunKeeper, my experience uploading Garmin GPS data to RunKeeper has been less than stellar. My biggest complaint is that usually the original GPS data changes significantly when uploaded to RunKeeper. For example, a 10.0 km run (according to Garmin) can become 10.2 km in RunKeeper.
I decided to do a bit of data mining on Garmin GPS files to figure out how RunKeeper interprets them differently. Garmin’s tools can export GPS data as both GPX and TCX. The latter format is developed by Garmin and is quite likely the closest match to their native format. Both GPX and TCX are XML.
I didn’t get very far with actual GPS track analysis, but I did write a TCX file reader in Haskell using the Haskell XML Toolbox (hxt) library. As Haskell XML parsing examples seem to be scarce on the Internet, I decided to post my TCX reader here as an example of parsing Real World XML data in Haskell.
Here’s a short sample of what a TCX file looks like (some elements have been omitted and xmlns URLs truncated for brevity):
<?xml version="1.0" encoding="UTF-8"?>
<!-- Some elements omitted for brevity -->
<TrainingCenterDatabase
    xsi:schemaLocation="http://www.garmin.com/xmlschemas/...d"
    xmlns:ns5="http://www.garmin.com/xmlschemas/ActivityGoals/v1"
    xmlns:ns3="http://www.garmin.com/xmlschemas/ActivityExtension/v2"
    xmlns:ns2="http://www.garmin.com/xmlschemas/UserProfile/v2"
    xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ns4="...">
  <Activities>
    <Activity Sport="Biking">
      <Id>2012-12-22T13:47:50.000Z</Id>
      <Lap StartTime="2012-12-22T13:47:50.000Z">
        <DistanceMeters>1000.0</DistanceMeters>
        <Track>
          <Trackpoint>
            <Time>2012-12-22T13:47:49.000Z</Time>
            <AltitudeMeters>-2.799999952316284</AltitudeMeters>
            <DistanceMeters>0.0</DistanceMeters>
            <HeartRateBpm>
              <Value>134</Value>
            </HeartRateBpm>
          </Trackpoint>
        </Track>
      </Lap>
    </Activity>
  </Activities>
</TrainingCenterDatabase>
The Haskell code for TCX file reading can be found below (it is also up on GitHub with a .cabal file). Here’s an outline of what it does:
- The input XML document is read from a file called test-act.tcx
- The XML document is massaged into Haskell objects using HXT combinators
- The resulting Haskell objects (Activity, Lap and Trackpoint) are traversed and output to stdout
Note: As is probably obvious, this code is not meant to be a comprehensive library for accessing TCX data; it’s really just an example of how to get started with HXT and TCX.
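Before the full program, it may help to see the second step of the outline in isolation. The following is only a minimal sketch, assuming that hxt is installed: it parses a small in-memory fragment with readString instead of reading a file, and pulls out (time, bpm) pairs as plain strings. The names sampleXml, timeAndBpm and extractPairs are made up for this illustration, and the second trackpoint in the fragment is invented sample data, not taken from a real TCX file.

```haskell
{-# LANGUAGE Arrows #-}
import Text.XML.HXT.Core

-- Find elements with the given name anywhere below the current node.
atTag :: ArrowXml a => String -> a XmlTree XmlTree
atTag tag = deep (isElem >>> hasName tag)

-- Extract the text content of the current element's children.
text :: ArrowXml a => a XmlTree String
text = getChildren >>> getText

-- Hypothetical test input: the first trackpoint mirrors the sample above,
-- the second one is invented for illustration.
sampleXml :: String
sampleXml = concat
  [ "<Track>"
  , "<Trackpoint><Time>2012-12-22T13:47:49.000Z</Time>"
  , "<HeartRateBpm><Value>134</Value></HeartRateBpm></Trackpoint>"
  , "<Trackpoint><Time>2012-12-22T13:47:50.000Z</Time>"
  , "<HeartRateBpm><Value>135</Value></HeartRateBpm></Trackpoint>"
  , "</Track>"
  ]

-- For every Trackpoint element, pair its timestamp with its heart rate.
timeAndBpm :: ArrowXml a => a XmlTree (String, String)
timeAndBpm = atTag "Trackpoint" >>>
  proc tp -> do
    time <- text <<< atTag "Time" -< tp
    bpm  <- text <<< atTag "Value" <<< atTag "HeartRateBpm" -< tp
    returnA -< (time, bpm)

-- Run the arrow over an XML string; runX collects all results in a list.
extractPairs :: String -> IO [(String, String)]
extractPairs xml = runX (readString [withValidate no] xml >>> timeAndBpm)

main :: IO ()
main = extractPairs sampleXml >>= mapM_ print
```

Each arrow yields zero or more results, so a fragment with two Trackpoint elements produces two pairs. The full program uses listA to gather such results into a list inside a parent record.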
{-# LANGUAGE Arrows, NoMonomorphismRestriction #-}
import Text.XML.HXT.Core
import Data.Time (UTCTime, readTime)
import System.Locale (defaultTimeLocale)

data Activity = Activity [Lap]
  deriving (Show)

data Lap = Lap {
    lapDistance :: Float
  , lapTrackpoints :: [Trackpoint]
  } deriving (Show)

data Trackpoint = Trackpoint {
    tpTime :: UTCTime
  , tpBpm :: String
  } deriving (Show)

atTag :: ArrowXml a => String -> a XmlTree XmlTree
atTag tag = deep (isElem >>> hasName tag)

text :: ArrowXml a => a XmlTree String
text = getChildren >>> getText

-- Note: the hardcoded .000 part is a kludge, but for my inputs this was
-- an easy way to get timestamps to parse.
readt :: String -> UTCTime
readt = readTime defaultTimeLocale "%FT%T.000%Z"

getTrackpoint :: ArrowXml a => a XmlTree Trackpoint
getTrackpoint = atTag "Trackpoint" >>>
  proc x -> do
    time <- text <<< atTag "Time" -< x
    bpm <- text <<< atTag "Value" <<< atTag "HeartRateBpm" -< x
    returnA -< Trackpoint (readt time) bpm

getLap :: ArrowXml a => a XmlTree Lap
getLap = getChildren >>> isElem >>> hasName "Lap" >>>
  proc x -> do
    pts <- listA getTrackpoint <<< atTag "Track" -< x
    dist <- getChildren >>> isElem >>> hasName "DistanceMeters" >>> text -< x
    returnA -< Lap (read dist) pts

getActivity :: ArrowXml a => a XmlTree Activity
getActivity = atTag "Activity" >>>
  proc x -> do
    laps <- listA getLap -< x
    returnA -< Activity laps

getActivities :: ArrowXml a => a XmlTree [Activity]
getActivities = deep (isElem >>> hasName "TrainingCenterDatabase" /> hasName "Activities") >>>
  proc x -> do
    activities <- listA getActivity -< x
    returnA -< activities

main :: IO ()
main = do
  -- runX returns one result per match of getActivities; only the first
  -- result is printed, so head fails if nothing matched.
  activities <- runX (readDocument [withValidate no] "test-act.tcx" >>> getActivities)
  mapM_ printActivity (head activities)
  where
    printActivity (Activity laps) = do
      putStrLn "Activity:"
      mapM_ printLaps laps
    printLaps (Lap distance trackpts) = do
      putStrLn " Lap:"
      putStrLn (" Distance: " ++ show distance)
      mapM_ printTrackpoint trackpts
    printTrackpoint (Trackpoint time bpm) =
      putStrLn (" time: " ++ show time ++ " bpm: " ++ show bpm)
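
The readt helper above hardcodes the .000 fraction and relies on readTime together with System.Locale, which newer releases of the time package deprecate and eventually drop. As a hedged alternative, here is a small sketch that assumes time 1.5 or later. It uses parseTimeM with the %Q specifier so the fractional part is optional, and returns Maybe instead of failing at runtime. The name parseTcxTime is made up for this example and is not part of the original reader.

```haskell
import Data.Time (UTCTime, defaultTimeLocale, parseTimeM)

-- Parse a TCX-style timestamp such as "2012-12-22T13:47:50.000Z".
-- %Q matches an optional fractional-seconds part; the trailing Z is
-- matched literally. The result is Nothing if the input does not fit.
parseTcxTime :: String -> Maybe UTCTime
parseTcxTime = parseTimeM True defaultTimeLocale "%Y-%m-%dT%H:%M:%S%QZ"

main :: IO ()
main = do
  print (parseTcxTime "2012-12-22T13:47:50.000Z")  -- with fraction
  print (parseTcxTime "2012-12-22T13:47:49Z")      -- without fraction
  print (parseTcxTime "not a timestamp")           -- malformed input
```

To plug this into the reader, tpTime would either become a Maybe UTCTime or the arrow would have to drop trackpoints whose timestamp fails to parse. Which of the two fits better depends on how strict you want to be about malformed input.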