Reflection for LIBR 509
The Journey
The course can be roughly divided into three sections:
- Create exercises
- Analyze exercises
- System integration exercises
In the create exercises, we follow examples and create simple subsystems of a small, manageable size.
In the analyze exercises, we learn to critically analyze systems already in place.
In the system integration exercises, we learn to understand how these subsystems work together. We are also tasked with creating our very own integration.
What works well
- It is a very refreshing and structured approach. My background is in technology and marketing, and I am no stranger to these systems. Rarely, however, do I look critically at each component and work out how each of them works.
- I also learned from many students in different disciplines how information systems are used in different contexts. This is very important, as you start to understand why systems designed for different purposes and audiences can be so different.
- There is no single best system; each system has its own set of trade-offs and might not work in every instance.
- My background in marketing gives me a tendency to favour views, impressions, and potential matches over a more selective result set. For marketers, it is a good thing to gain exposure and offer an option as long as it is relevant. Relevancy can be addressed by search order and machine learning algorithms. In other settings, I have learned that such an approach might have a negative impact on the user experience. It takes some adjustment, and I really enjoy adding another dimension to my thought process.
What could be better
- As previously stated, because there is no single right answer, it is very important to understand the context of each work.
- A lot of the learning requires insights from the designer or the users. It is not always immediately obvious why a classmate made certain design decisions. A more interactive approach, with an opportunity to ask questions and provide feedback based on those conversations, would make the learning much more meaningful.
- As many systems on the internet are designed as a whole, the boundaries between classification, controlled vocabulary, data format, and content schema are often vague. For instance, the schema.org entities (data schema and content standard) are designed to work with HTML and other technologies while following many established standards and conventions (ISO, etc.), which often act as the controlled vocabulary. Entities in a hierarchy often act as classifications as well (i.e., categories). Furthermore, most of the data can be exchanged in XML or JSON format (data format). In short, the schemas at schema.org and their definitions encompass all facets (pun not intended) of the system.
As a result, it is very difficult to analyze the subsystems in isolation, which also caused confusion during the analysis exercises.
The Next Steps
The system integration create exercise is a creative exercise that allows one to learn to put everything together. For the purpose of the assignment and my journey through the MLIS program, I have started creating the resources (and other subsystems such as Classification, Thesaurus, etc.) using the same theme (Point of Interest in Japan) to create a more cohesive, ongoing learning opportunity. Some of the assignments have been redone from scratch, and some of the implementation work is already in the WordPress system. I look forward to expanding the use of the prototype in future coursework.
The Create Exercises
Classification
- Some points of interest could fall under multiple point-of-interest types; for example, a hotel could also have a restaurant operating under the same name. In such cases, only the primary type is considered. For a real-life application, we would probably want to include all classifications for each resource and let the search algorithm determine the optimal order. For an itinerary website, the benefit of showing a potential match probably outweighs the cost of serving an irrelevant result. Machine learning algorithms can help prioritize the results and improve accuracy and specificity over time.
- In the above example, each prefecture is part of a region, which in turn is part of an island. The hierarchy is very clear, with no overlaps; that is, if a prefecture is within Kanto, that prefecture has to be on Honshu Island. This is not the case for the 広域 (wide area) definition, in which a prefecture could fall under a different wide area from other prefectures within the same region. For instance, Mie Prefecture, which is part of the Kinki Region, is part of the 中部圏 (Chubu Circle), while other prefectures in the Kinki Region belong to the 近畿圏 (Kinki Circle).
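The difference between the strict hierarchy and the wide-area definition can be sketched in a few lines of Python. This is only an illustration: the dictionary and function names are hypothetical, only a handful of prefectures are listed, and the wide-area values simply follow the Mie example above.

```python
# Strict hierarchy: each prefecture has exactly one region,
# and each region sits on exactly one island.
REGION_OF = {"Tokyo": "Kanto", "Mie": "Kinki", "Osaka": "Kinki"}
ISLAND_OF = {"Kanto": "Honshu", "Kinki": "Honshu"}

# Wide-area membership is recorded per prefecture, because it
# cannot be derived from the region alone (illustrative values).
WIDE_AREA_OF = {"Mie": "中部圏", "Osaka": "近畿圏"}


def island_of(prefecture):
    """Walk up the hierarchy: prefecture -> region -> island."""
    return ISLAND_OF[REGION_OF[prefecture]]


def wide_areas_in_region(region):
    """Collect every wide area used by the prefectures of one region."""
    return {
        WIDE_AREA_OF[prefecture]
        for prefecture, its_region in REGION_OF.items()
        if its_region == region and prefecture in WIDE_AREA_OF
    }
```

Looking up the island needs only the region, whereas a single region (Kinki here) maps to more than one wide area, which is why the wide area has to be stored as its own attribute rather than as another level of the tree.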
Thesaurus
- If the relationships are set up properly within the information system, the following thesaurus can easily be generated automatically.
- Each BT-NT pair represents a parent-child relationship. Each USE-UF pair is similar to an alias in computing terms. Each RT pair indicates a related-item relationship. All these relationships can be easily represented and modeled in relational databases.
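As a rough sketch of how such relationships might be modeled and turned into thesaurus entries, the Python example below stores each relationship once in an in-memory SQLite table and derives the inverse side (NT from BT, UF from USE, RT in both directions). The table layout, the function name, and the sample terms are assumptions made for illustration; they are not taken from the actual assignment.

```python
import sqlite3

# Each stored relationship implies its inverse.
INVERSE = {"BT": "NT", "USE": "UF", "RT": "RT"}

# Hypothetical sample rows: (term, relationship, target term).
SAMPLE_ROWS = [
    ("Ryokan", "BT", "Accommodation"),  # parent-child
    ("Inn", "USE", "Ryokan"),           # alias pointing to the preferred term
    ("Ryokan", "RT", "Onsen"),          # related item
]


def build_thesaurus(rows):
    """Return {term: ["REL target", ...]} with inverse relationships added."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE relation (term TEXT, rel TEXT, target TEXT)")
    conn.executemany("INSERT INTO relation VALUES (?, ?, ?)", rows)
    # Insert the inverse of every stored relationship.
    conn.executemany(
        "INSERT INTO relation VALUES (?, ?, ?)",
        [(target, INVERSE[rel], term) for term, rel, target in rows],
    )
    entries = {}
    query = "SELECT term, rel, target FROM relation ORDER BY term, rel, target"
    for term, rel, target in conn.execute(query):
        entries.setdefault(term, []).append(f"{rel} {target}")
    conn.close()
    return entries


thesaurus = build_thesaurus(SAMPLE_ROWS)
```

Storing only one direction of each pair keeps the data consistent, since the NT and UF lines are always generated from the BT and USE rows rather than maintained by hand.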
The Analyze Exercises
I think there is some confusion about whether this is a discussion of multiple standards. BCP 47 is specifically designed to work with other published standards, such as the HTML specification and the ISO 3166 country codes, and it inherits many of the limitations of these standards. For this example, we are looking only at the classification aspect of the standard.
The Classification and Controlled Vocabulary exercises were done as a pair. The Content Standard and Data Format exercises were likewise done as a pair.
Classification
For instance, spoken Cantonese, a Chinese dialect (or the unofficial written form of the spoken dialect), cannot currently be represented by this classification. This limitation has nothing to do with the HTML specification (which uses BCP 47 to represent the language of an HTML document, though that is not the only use of BCP 47) or the ISO country specification (the standard leveraged to indicate a country, which is a small but integral part of the BCP 47 specification).
Controlled Vocabulary
For instance, a valid language value must use one of the recognized languages and one of the recognized ISO countries. As one may see, the controlled vocabulary enjoys many of the strengths of the specification but also suffers from many of its weaknesses.
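The controlled-vocabulary idea can be illustrated with a deliberately simplified check in Python. This sketch assumes only the two-part language-region form of a tag and uses a tiny hand-picked sample of subtags. The real BCP 47 syntax also allows script, variant, and extension subtags, and real validation would rely on the full subtag registry, so the names and sets below are illustrative only.

```python
# Tiny sample vocabularies (assumptions for illustration, not the full lists).
LANGUAGES = {"en", "ja", "zh"}   # language subtags
REGIONS = {"US", "JP", "HK"}     # two-letter ISO 3166-1 country codes


def is_valid_tag(tag):
    """Accept 'language' or 'language-REGION' built from the sample sets."""
    parts = tag.split("-")
    if len(parts) == 1:
        return parts[0].lower() in LANGUAGES
    if len(parts) == 2:
        language, region = parts
        # Tags are case-insensitive, so normalize before the lookup.
        return language.lower() in LANGUAGES and region.upper() in REGIONS
    # Script, variant, and extension subtags are out of scope for this sketch.
    return False
```

A value such as `ja-JP` passes because both parts come from the recognized lists, while `ja-XX` fails on the country part; anything the lists do not contain simply cannot be expressed, which is both the strength and the weakness noted above.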
Content Standard
The schema.org event schema, like the other schemas published at schema.org, is designed to provide a guideline for expected interoperability. It strikes a good trade-off by being generic and flexible enough for most use cases.
It is also interesting to note that major industry players have come together to support the standard, as interoperability is key for information access and distribution.
Data Format
The data format is essentially JSON output that can be consumed by APIs and web services. Systems planning to support the event schema would consume the JSON input and parse the individual attributes (as specified by the event schema) into their own native, internal format.
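As a minimal sketch of that consume-and-parse step, the Python example below reads an event description and maps a few of its attributes into a made-up internal record. The `InternalEvent` class, its field names, and the sample payload are hypothetical; `name`, `startDate`, and `location` are properties of the schema.org Event type, and the `@context` and `@type` keys assume the JSON-LD style commonly used with schema.org.

```python
import json
from dataclasses import dataclass


@dataclass
class InternalEvent:
    """Hypothetical native format of the consuming system."""
    title: str
    starts_on: str
    venue: str


def parse_event(payload):
    """Map selected event attributes from JSON text to the internal record."""
    data = json.loads(payload)
    if data.get("@type") != "Event":
        raise ValueError("payload is not an Event")
    location = data.get("location", "")
    # The location may be a nested Place object or plain text.
    venue = location.get("name", "") if isinstance(location, dict) else str(location)
    return InternalEvent(title=data["name"], starts_on=data["startDate"], venue=venue)


# Made-up sample payload for illustration.
SAMPLE = """
{
  "@context": "https://schema.org",
  "@type": "Event",
  "name": "Sample Festival",
  "startDate": "2024-07-01",
  "location": {"@type": "Place", "name": "Sample Park"}
}
"""

event = parse_event(SAMPLE)
```

The consuming system keeps only the attributes it needs and is free to rename them, which is what lets each system retain its own internal format while still exchanging data through the shared schema.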