Archive

Posts Tagged ‘xml’

How to download your WordPress.com stats in CSV, JSON, or XML format

January 30, 2014 11 comments

I wanted raw data about the popularity of my various posts on this blog to better determine what sort of topics I should post about. WordPress.com provides some nice aggregate stats, but I wanted more. After stumbling around the Internet for awhile, I cobbled together a way to download my blog data in either CSV, XML, or JSON format.

There are three steps:

  1. Get an API key
  2. Get your blog URL
  3. Construct the URL to download the data

Get an API key

Akismet is WordPress.com’s anti-spam solution. Register for an Akismet API key at http://akismet.com/wordpress/ by clicking on “Get an Akismet API key”.

Sign up for an account. If you choose the personal blog option, you can drag the slider all the way to the left and register for free. If you value the service that Akismet provides, you can pay more. When you complete the signup flow, you will be provided with a 12 digit ID. Copy this down.

Get your blog URL

Copy the full URL of your blog, minus the leading https://. For me this is developmentality.wordpress.com.

Construct the URL

There is a limited API for downloading your data at the following URL:

http://stats.wordpress.com/csv.php

View this in a browser to see what the API parameters are.

API as of 2014/01/28

Construct the url

http://stats.wordpress.com/csv.php?api_key=<api_key>&blog_uri=<blog_uri>

View this URL in the browser (or via wget / curl) and you should see the view data.

CSV rows returned

There are multiple data sources. From the documentation:

table String One of views, postviews, referrers, referrers_grouped, searchterms, clicks, videoplays

Here is some sample data from each table. Change the format param from csv to json or xml to get the data in different formats.

views

CSV

"date","views"
"2013-12-31",118

JSON

[{"date":"2010-02-05","views": 46}]

XML

<views>
    <day date="2014-01-01">112</day>
</views>

postviews

CSV

"date","post_id","post_title","post_permalink","views"
"2014-01-28",369876479,"Three ways of creating dictionaries in Python","https://developmentality.wordpress.com/2012/03/30/three-ways-of-creating-dictionaries-in-python/",46

JSON

[{"date":"2014-01-29","postviews":[{"post_id":369876479,"post_title":"Three ways of creating dictionaries in Python","permalink":"http:\/\/developmentality.wordpress.com\/2012\/03\/30\/three-ways-of-creating-dictionaries-in-python\/","views":22},{"post_id":369875635,"post_title":"R - Sorting a data frame by the contents of a column","permalink":"http:\/\/developmentality.wordpress.com\/2010\/02\/12\/r-sorting-a-data-frame-by-the-contents-of-a-column\/","views":16}]}]

XML

<postviews>
    <day date="2014-01-30"></day>
    <day date="2014-01-29">
        <post id="369876479" title="Three ways of creating dictionaries in Python" url="https://developmentality.wordpress.com/2012/03/30/three-ways-of-creating-dictionaries-in-python/">54</post>
    </day>
</postviews>

referrers

CSV

"date","referrer","views"
"2014-01-28","http://www.google.com/",63

JSON

[{"date":"2014-01-30","referrers":[]},{"date":"2014-01-29","referrers":[{"referrer":"http:\/\/www.google.com\/","views":66},{"referrer":"www.google.com\/search","views":27},{"referrer":"www.google.co.uk","views":10}]}]

XML

<referrers>
    <day date="2014-01-30"></day>
    <day date="2014-01-29">
        <referrer value="http://www.google.com/" count="" limit="100">66</referrer>
    </day>
</referrers>

referrers_grouped

CSV

"date","group","group_name","referrer","views"
"-","Search Engines","Search Engines","http://www.google.com/",1256

JSON

[{"date":"-","referrers_grouped":[{"referrers_grouped":"Search Engines","views":{"http:\/\/www.google.com\/":1305}}]}]

XML

<referrers_grouped>
    <day date="-">
        <group domain="Search Engines" name="Search Engines">
            <referrer value="http://www.google.com/">1305</referrer>
        </group>
    </day>
</referrers_grouped>

Dates aren’t included so it’s the sum over the past N days, defaulting to 30. To change this, set the days URL parameter:

http://stats.wordpress.com/csv.php?api_key=<api_key>&blog_uri=<blog_uri>&table=referrers_grouped&days=<num_days>

searchterms

CSV

"date","searchterm","views"
"2014-01-28","encrypted_search_terms",190

JSON

[{"date":"2014-01-30","searchterms":[]},{"date":"2014-01-29","searchterms":[{"searchterm":"encrypted_search_terms","views":159},{"searchterm":"dynamically load property file in mule","views":2}]}]

XML

<searchterms>
    <day date="2014-01-30"></day>
    <day date="2014-01-29">
        <searchterm value="encrypted_search_terms" count="" limit="100">159</searchterm>
        <searchterm value="dynamically load property file in mule" count="" limit="100">2</searchterm>
    </day>
</searchterms>

clicks

CSV

"date","click","views"
"2014-01-28","http://grab.by/grabs/b608b9c315119ca07a1f7083aabbb9c7.png",3

JSON

[{"date":"2014-01-30","clicks":[]},{"date":"2014-01-29","clicks":[{"click":"http:\/\/www.anddev.org\/extended_checkbox_list__extension_of_checkbox_text_list_tu-t5734.html","views":2},{"click":"http:\/\/android.amberfog.com\/?p=296","views":2}]}]

XML

<clicks>
    <day date="2014-01-30"></day>
    <day date="2014-01-29">
        <click value="http://www.anddev.org/extended_checkbox_list__extension_of_checkbox_text_list_tu-t5734.html" count="" limit="100">2</click>
    </day>
</clicks>

videoplays

I am not sure what this format is as I have no video plays on my blog.

Conclusion

I hope you find this useful. I’ll make another post later showing how to crunch some of this data and extract meaningful information from the raw data.

New lines in XML attributes

April 26, 2011 Leave a comment

If you have an attribute in xml that spans multiple lines, e.g.

 q2="2
B"

you might expect the newline literal to be encoded in the resulting string when the attribute is parsed. Instead, the above example will be parsed as “2 B”, at least with Java’s SAX parser implementation. In order to have the new line literal included, you should insert the entity & #10; instead (this entity keeps getting eaten by wordpress, so ignore the space) This StackOverflow answer by Tomalak gives some more insight:

Bottom line is, the value string is saved verbatim. You get out what you put in, no need to interfere.
However… some implementations are not compliant. For example, they will encode & characters in attribute values, but forget about newline characters or tabs. This puts you in a losing position since you can’t simply replace newlines with
beforehand.

Upon parsing such a document, literal newlines in attributes are normalized into a single space (again, in accordance to the spec) – and thus they are lost.
Saving (and retaining!) newlines in attributes is impossible in these implementations.

Categories: Java Tags: , ,

How to use Java .properties files in Mule

September 18, 2010 2 comments

Externalizing ports and IP addresses (or anything else for that matter) in Mule

Mule is a great piece of open source software known as an Enterprise Service Bus. It is designed to make it easy to integrate various systems which were not explicitly built to work with each other. For instance, it handles all the details of various transport mechanisms (e-mail, HTTP, TCP, UDP, files) that your data might be shuttled around in, as well as the transformers that convert data from one format to another (e.g., the bytes of a TCP packet, into a String, into an XML document, into a Java object representing that XML).

Mule services are configured via XML, in particular the Spring framework. This post is designed to inform the reader as to how to incorporate Java .properties files into the XML.

.properties files, for those who are unfamiliar, is a simple Key=Value storage mechanism in widespread use in Java development. From the wikipedia explanation, here are a few lines of a .properties file:

website = http://en.wikipedia.org/
language = English
# The backslash below tells the application to continue reading
# the value onto the next line.
message = Welcome to \
          Wikipedia!

The documentation for Mule / Spring advises that you break up configuration files into multiple files and then reassemble them as needed. In Mule’s case, this allows you to run one instance of mule per small function, allowing you to restart just the piece that needs to when a change is made, rather than bringing down all the pieces. Furthermore it makes it easy to reason about each modular piece when broken down in this way.

Unfortunately, splitting the files up like this can easily cause a lot of duplication, especially of IP addresses and ports. If you are sending objects to the same IP address and port from multiple configuration files, you might end up with multiple instances of lines of configuration like

tcp:inbound-endpoint address="tcp://192.56.33.21:235"

Fortunately, by storing the IP addresses and ports in .properties file, you can eliminate code duplication and allow the variables to be changed in a single place and have the change reflected in all files referencing these variables. Additionally, if you name the variables properly, the impenetrable IP addresses instead become self documenting strings:

tcp:inbound-endpoint address="${email.server.address}"

This ${x} syntax should be familiar to anyone who has used Ant in the past. This basically says, find the property with the key email.server.address and textually substitute its value here. This assumes that you have a .properties file with the line

email.server.address=tcp://192.56.33.21:235

For the purposes of this post, assume that the line is defined in a file called test.properties.

Unfortunately, the way to do this is not clearly defined in any document I’ve seen, which is why I want to explain how to do it here.

Existing information

The first result in google for “mule java properties” is Mule: Configuring Properties, which is 5 years old and refers to Mule 1.5 (I figured this out the hard way). There is more up to date information by searching for “configuring properties”, particularly Configuring Properties – Mule 2.x User Guide.

Here is the relevant information:

To load properties from a file, you can use the standard Spring element
:

 xmlns:context="http://www.springframework.org/schema/context"
 xsi:schemaLocation="http://www.springframework.org/schema/context
 http://www.springframework.org/schema/context/spring-context-2.5.xsd"
 <context:property-placeholder location="smtp.properties"/>

(In other words you need to add those schema declarations at the top of your file, after the description field, before creating a context:property-placeholder element to tell Mule all the .properties files you want pulled in to your file).

Would that it were so easy.

If you change this example to the name of your .properties file, you will probably get a FileNotFoundException, even if you give the complete path to the .properties file.

A Fatal error: class path resource [C:/Mule/mule-standalone-2.2.1/conf/test.properties] cannot be opened because it does not exist

This message is extremely unhelpful because the file does exist at that exact location. After some digging, I found two workarounds, one of which is hinted at by the error message, another is not.

Classpath

By default, the .properties file is searched for in the classpath that the Mule environment runs in. Thus you need to ensure that the folder in which your test.properties file is located in is also on that classpath.

The wrapper.conf file in $MULE_HOME/conf is where the classpath is defined:

wrapper.java.classpath.1=%MULE_LIB%
wrapper.java.classpath.2=%MULE_EXE%/../conf
wrapper.java.classpath.3=%MULE_HOME%/lib/boot/*.jar

If you place your .properties files in any of those folders, it will be picked up. That’s probably not the ideal place for your files, however. If you wish to add an additional folder, you merely add another entry. For instance, if I store the .properties files in /Dev/Mule/Configs/Properties, I would add the line

wrapper.java.classpath.4=/Dev/Mule/Configs/Properties

Note that you must add the consecutive numbers to each classpath you add, or Mule will not pick up on the change correctly.

You can make it explicit to the readers of your configuration file that you are including a .properties file that’s located on the classpath with the classpath prefix:

<context:property-placeholder location="classpath:test.properties">

Absolute locations

If you wish to specify the absolute path to a resource rather than relying on classpath resolution, you must prefix the path with file///. So our previous example becomes

<context:property-placeholder location="file///Dev/Mule/Configs/Properties/test.properties">

(You also need 3 slashes even if you’re on Windows)

Conclusion

It’s a very good idea to externalize ports and IP addresses from the XML configuration files that Mule needs to run. This allows you to make changes to the ports and IP addresses in one place rather than in all the files that reference them. It also allows you to associate a more meaningful name to the addresses than an IP address; it is self-documenting in that regard. Unfortunately the process for importing .properties files into your Mule configuration files is not well documented, which is what I am attempting to remedy here.

Categories: Java, mule Tags: , , , , ,