
A Tale of Two Systems

Nguyễn Gia Hào

Academic year: 2023


And, of course, with a new architecture some of the mistakes and unfortunate choices built into the old architecture can be avoided. For each of the technologies and for most of these case histories we provide—mostly online—exploratory exercises.

TeachEngineering (TE) Overview

Introduction

Brief History

The TeachEngineering Resource Collection

Lessons and activities do not have to be part of units; they can live 'on their own', and activities can be part of curricular units without being part of a lesson. Five activities (A2-A6) are attached to lessons and one (A1) is attached directly to the unit. (b) The All Caught Up curriculum consists of two lessons and three activities.
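The hierarchy described above can be sketched as a small data structure. This is only an illustration, not TeachEngineering's actual data model: the class names are hypothetical, and the assignment of A2-A6 to particular lessons is an assumption, since the text only says that five activities sit on lessons and one sits directly on the unit.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    name: str


@dataclass
class Lesson:
    name: str
    activities: list = field(default_factory=list)


@dataclass
class CurricularUnit:
    name: str
    lessons: list = field(default_factory=list)
    # Activities attached directly to the unit, not to a lesson.
    activities: list = field(default_factory=list)

    def all_activities(self):
        """Return activities on the unit itself, then those on its lessons."""
        via_lessons = [a for lesson in self.lessons for a in lesson.activities]
        return list(self.activities) + via_lessons


# Hypothetical distribution of A2-A6 over L1-L3 (not given in the text).
c1 = CurricularUnit(
    name="C1",
    activities=[Activity("A1")],
    lessons=[
        Lesson("L1", [Activity("A2"), Activity("A3")]),
        Lesson("L2", [Activity("A4")]),
        Lesson("L3", [Activity("A5"), Activity("A6")]),
    ],
)
activity_names = [a.name for a in c1.all_activities()]
```

Because both lessons and units hold their own activity lists, a stand-alone activity or lesson is simply one that no parent references.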

Controlled Resource Content

While the exact list of document components—different for different resource types—is not important here, it's important to realize that these components come in two types: mandatory and optional. The last point, the automatic generation of metadata, is important because it is a classic bottleneck in the .

Figure 3 shows part of a TeachEngineering activity as it appears in a user’s Web browser

K-12 Educational Standards

TeachEngineering has enjoyed a long-standing relationship with ASN, and although there are major differences between how TE 1.0 used ASN and how TE 2.0 uses it now, ASN continues to function as the de facto repository of TeachEngineering's K-12 standards. While tracking K-12 standards is readily available from services such as ASN, standards alignment, that is, matching learning resources to standards, is much more problematic.

Collection Editing and Resource Accessioning

While it is not our goal here to thoroughly analyze the standards alignment problem, it is important to know that in TeachEngineering each resource is aligned to one of several K-12 education standards, and since standards change frequently, these alignments must also be updated regularly.

System Implementation and Collection Hosting

Extras

Continuous Quality Control

Why Build (Twice!) Instead of Buy or Rent?

Build, Buy or Rent?

It makes one wonder how much more we could achieve if we didn't have to 'waste' all these resources to protect against the 'bad guys'. From a "build or buy" perspective, this means that we must either build security into our programs ourselves (build) or trust that such security is built into the services and products we obtain from others (buy).

So Why Was TeachEngineering Built Rather Than Bought… Twice?

The lack of user interface configurability of these early systems led many other DL initiatives to develop their own software. Good examples include projects such as the Applied Math and Science Education Repository (AMSER), the AAPT ComPADRE Physics and Astronomy Digital Library, the Alexandria Digital Library (ADL), the (now defunct) Digital Library for Earth System Education (DLESE), and many others.

What About TE 2.0?

It would of course have been possible to adapt any of the popular content management systems to work with this structured data. Finally, the three most popular content management systems are built using the PHP programming language.

Figure 1: Hierarchical structure of TE documents. (a) General example. The curricular unit C1 has three lessons (L1-L3) and six activities (A1-A6)

A Word on Open Source

TE 1.0 – XML

In this chapter, we introduce XML as a data representation and exchange format and give some examples of how it is used in TE 1.0. In the next chapter, on JSON, we go deeper into how XML is used in TE 1.0 and how JSON is used in TE 2.0.

Representing Content With XML

Instead, we need to provide an explicit semantic model of the document content, along with the document itself. Note that your web browser recognizes the content of the file as XML and displays it accordingly.

Figure 1: The first five bars of the main melody of Beethoven’s Symphony No 5.

XML Syntax Specification: DTD and XML Schema

One that is currently in active use by the US Securities and Exchange Commission (SEC) is XBRL for business reporting (Baldwin & Brown, 2006). For example, going back to our TeachEngineering 1.0 lesson example, the DTD/XSD for the lesson would specify that a lesson document should have a title, a target grade set, one or more keywords, one or more benchmarks, time and cost estimates, etc.

Note: Well formed ≠ Valid

An XML document is considered well formed if it obeys the basic rules of XML syntax. A document can be well formed yet invalid: this happens if the document obeys basic XML syntax rules but violates DTD/XSD rules.

Table 1: relationships between XML well-formedness and validity

Must All XML Have a DTD/XSD?

The document is therefore not well formed and any malformed document is considered invalid (cell 3 in Table 1). However, an XML document can be well-formed and still be invalid (cell 2 in Table 1).
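The distinction between well-formedness and validity can be sketched in a few lines of Python. Python's standard library checks well-formedness but does not validate against a DTD or XSD, so the validity check below is a hand-written stand-in: the list of required child elements is a hypothetical simplification of what a real lesson DTD/XSD would specify, and the sample documents are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for DTD/XSD rules: elements every lesson must contain.
REQUIRED_CHILDREN = ("title", "grade", "keywords")


def classify(xml_text):
    """Classify a document as malformed, well formed but invalid, or valid."""
    try:
        root = ET.fromstring(xml_text)  # raises ParseError if not well formed
    except ET.ParseError:
        return "not well formed (hence invalid)"
    missing = [tag for tag in REQUIRED_CHILDREN if root.find(tag) is None]
    if missing:
        return "well formed but invalid"
    return "well formed and valid"


valid_doc = ("<lesson><title>Heat Flow</title><grade>6</grade>"
             "<keywords>heat</keywords></lesson>")
invalid_doc = "<lesson><title>Heat Flow</title></lesson>"
malformed_doc = "<lesson><title>Heat Flow</lesson>"  # <title> is never closed

results = [classify(doc) for doc in (valid_doc, invalid_doc, malformed_doc)]
```

A real project would hand the validity check to a schema-aware parser instead of maintaining a list like `REQUIRED_CHILDREN` by hand.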

Enough Theory. Time For Some Hands-on

Documents Coded and Stored as XML

However, a quick look at the activity URL shows that the view_activity.php program is passed the parameter url, which is set to the value collect/mis_/activities/mis_eyes/mis_eyes_lesson01_activity1.xml. While view_activity.php renders the activity, it extracts the tags and their identifiers from the XML.
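To make the parameter passing concrete, the following sketch pulls the url parameter out of such an activity URL. The host name example.org is a placeholder assumption; only the program name and the parameter value come from the text.

```python
from urllib.parse import parse_qs, urlsplit

# example.org is a placeholder host, not the real TeachEngineering address.
activity_url = (
    "https://example.org/view_activity.php"
    "?url=collect/mis_/activities/mis_eyes/mis_eyes_lesson01_activity1.xml"
)

# parse_qs maps each parameter name to a list of its values.
query = parse_qs(urlsplit(activity_url).query)
xml_path = query["url"][0]
```

The value of `xml_path` is the location of the XML file that the rendering program would then read and parse.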

Service-Oriented Architectures and Business Process Management

Web Services Example I: K-12 Standards

As mentioned in the introductory chapter, all of TeachEngineering's curriculum is aligned with K-12 STEM standards. Although the TE team is responsible for these adaptations, it is not in the business of tracking the standards themselves.

Web Services Example II: Metadata Provisioning

In this approach, a search query is distributed among the various members of the association; in this case the various NSDL member libraries. Hide each of the s (click the '-' sign in front of each one).

Serving Different XML Formats with XSLT

But if you look at the tag at the bottom of the XML, you'll see that completeListSize=1581. In other words, we need to 'translate' the first XML format into the second XML format.
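The idea of translating one XML format into another can be sketched as follows. Note that this is not XSLT: Python's standard library has no XSLT processor, so the sketch uses ElementTree to do the same kind of mapping by hand. Both the source and the target element names are hypothetical and do not reflect the actual metadata formats involved.

```python
import xml.etree.ElementTree as ET

# Hypothetical source format: <record><name>...</name><level>...</level></record>
SOURCE_XML = "<record><name>Heat Flow</name><level>6</level></record>"


def translate(source_xml):
    """Map the hypothetical <record> format onto a hypothetical <resource> format."""
    src = ET.fromstring(source_xml)
    # The source <name> element becomes a title attribute in the target.
    dst = ET.Element("resource", {"title": src.findtext("name", default="")})
    # The source <level> element is renamed to <grade>.
    grade = ET.SubElement(dst, "grade")
    grade.text = src.findtext("level", default="")
    return ET.tostring(dst, encoding="unicode")


translated = translate(SOURCE_XML)
```

An XSLT stylesheet expresses the same mapping declaratively, as templates matched against the source tree, rather than as procedural code.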

XSLT in TE 1.0

TE 2.0 – JSON

JavaScript Object Notation (JSON) – The New XML

Of course, it doesn't matter whether the JSON is embedded in JavaScript, as in our example above, or whether we get it from an external source. Notice that this is almost exactly what we did in the exercise in the previous chapter when we extracted XML from a web resource and parsed it.
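The same parsing step looks much the same outside JavaScript. As a minimal sketch, here is JSON text being turned into native Python data structures; the field names and values are made up for illustration and are not taken from an actual TeachEngineering document.

```python
import json

# Hypothetical JSON text, as it might arrive from a file or a web service.
raw = '{"title": "Heat Flow", "grade": 6, "keywords": ["heat", "energy"]}'

resource = json.loads(raw)  # JSON object -> dict, JSON array -> list
second_keyword = resource["keywords"][1]
round_trip = json.loads(json.dumps(resource))  # serialise and parse again
```

Whether `raw` is a literal in the program or the body of an HTTP response makes no difference to `json.loads`.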

JSON in Python

DTDs or XSDs for JSON: JSON Schema

An example would be many of the data sets provided by the US government on sites such as data.gov, cdc.gov, or census.gov. You only have to look at the thousands of datasets available through www.data.gov to see that JSON is increasingly one of the formats being offered for data retrieval.

Table 2: Comparison of XML and JSON data sets found at www.data.gov.

TeachEngineering (TE 2.0) Resources as JSON structures

After all, the designers of TE 2.0 could have easily chosen to include only the standards references in the JSON representation and leave out the standards' content. Why this reshuffling of content and formatting in the XML and JSON after all the work done in the late 1990s and early 2000s to separate them?

Table 3: Ten most referenced K-12 education standards in TeachEngineering and their number of occurrences

Introduction: Relational Is No Longer the Default MO

Key-value stores are databases that function as lookup tables in which values to be looked up are indexed by a key, as shown in Figure 1. As the Wikipedia page on key-value stores shows, there are quite a few implementations available today.
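The lookup-table behaviour of a key-value store can be sketched with an in-memory dictionary. This toy class only illustrates the access pattern; it is not modelled on any particular product, and the keys and values are hypothetical.

```python
class KeyValueStore:
    """Toy in-memory key-value store: every value is reachable only via its key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value  # overwrites any existing value for this key

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)  # deleting a missing key is not an error


store = KeyValueStore()
store.put("activity:heat_flow", {"title": "Heat Flow", "grade": 6})
found = store.get("activity:heat_flow")
missing = store.get("activity:unknown")
```

The store knows nothing about the structure of its values, so there is no way to query "all activities for grade 6" without knowing the keys in advance.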

Figure 1: Lookup structure of a key-value store (source:

So, What Gives?

XML/Relational

Therefore, we needed a way to index the content of the resources as written by the curriculum authors in the MySQL relational database. Even better, when structural changes are needed to the database, all we have to do is change a few entries in the meta tables and let the program generate a brand new database while the existing production database and system remain operational.

Data Tables

The Types table in turn declares for each component the data type, null value, and an XPath expression to be used to extract the content from the XML source. Extraction of sprinkle information from the sprinkle XML documents to be stored in the tables was done automatically via the corresponding XPath expressions.
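A minimal sketch of this metadata-driven approach follows. It rests on several assumptions: SQLite stands in for MySQL so the example is self-contained, the `TYPES` entries and the sample XML are hypothetical, and the XPath expressions are limited to the subset that Python's ElementTree supports.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical 'Types' meta table: (column, SQL type, nullable, XPath expression).
TYPES = [
    ("title", "TEXT", False, "./title"),
    ("grade", "INTEGER", False, "./grade/target"),
    ("summary", "TEXT", True, "./summary"),
]


def create_table(conn, table, types):
    """Generate the data table from the meta table entries."""
    cols = ", ".join(
        f"{name} {sql_type}{'' if nullable else ' NOT NULL'}"
        for name, sql_type, nullable, _ in types
    )
    conn.execute(f"CREATE TABLE {table} ({cols})")


def load(conn, table, xml_text, types):
    """Extract one row from an XML document via the XPath expressions."""
    root = ET.fromstring(xml_text)
    row = tuple(root.findtext(xpath) for _, _, _, xpath in types)
    placeholders = ", ".join("?" for _ in types)
    conn.execute(f"INSERT INTO {table} VALUES ({placeholders})", row)


SAMPLE = ("<activity><title>Heat Flow</title>"
          "<grade><target>6</target></grade></activity>")

conn = sqlite3.connect(":memory:")
create_table(conn, "activity", TYPES)
load(conn, "activity", SAMPLE, TYPES)
stored = conn.execute("SELECT title, grade, summary FROM activity").fetchone()
```

Changing the schema then means editing `TYPES` and regenerating the table; neither `create_table` nor `load` needs to change.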

JSON/NoSQL

In TE 1.0 we could have stored in the database not just the typical resource lookup information (for example, title, target grade, time required, summary, keywords, etc.) but also the full text of the resources. Additionally, due to the relatively small number of resources stored in TE 2.0's RavenDB database, the '.

Figure 2: Heat flow activity lists Related Curriculum

Some Practice with a JSON Document Store: MongoDB

Our sample contains six TE 2.0 sprinkle resources and is stored at https://classes.business.oregonstate.edu/reitsma/sprinkles.json. The result of running the program should be 3 (check this against the sprinkles.json file).

Summary and Conclusion

Resource Accessioning

All libraries, digital or not, have processes for formally accepting and including items in their collection. Chief among these differences is that while in TE 1.0 editing and adding resources were two separate processes performed and controlled by different people, in TE 2.0 they were integrated into a single process performed by the resource editor.

Authoring ≠ Editing

  • Tagging: Not Quite WYSIWYG XML
  • Document Ingestion and Rendering
  • Tagging: Much More WYSIWYG JSON
  • Document Ingestion and Rendering
    • The Develop… Test… Build… Deploy Cycle

Rendering in TE 1.0 was done in PHP, then one of the more popular programming languages for web-based programming. As noted earlier, curriculum resources in TE 1.0 stored only the identifiers of the standards to which the resource was aligned.

Figure 1: Editing a section of the Tug of War activity’s XML representation in Altova’s Authentic

But First: Version Control

Develop-Test-Deploy

Before TE 1.0, this was a website that, while visible and accessible to the world, was anonymous because there were no links pointing to it on the web. Code developed in the sandbox is typically reviewed by TE 1.0 project members for functional adequacy and robustness.

Develop-Test-Deploy

In TE 2.0, developers code new features on their local machine using a shared (development) database instance. When new feature development has progressed to the point where it is ready to be included in the next production release, it is merged into the main branch of the code repository.

Behind the Curtain

Continuous System Monitoring

It also uses machine learning to detect events such as slow responses for users in specific geographic locations or an increase in the number of times a specific error occurs. Application Insights is also configured to access TE every five minutes from five different geographic locations.

TE Meta Monitoring

A New Application: The NGSS Explorer

The NGSS as a Network

The network is interactive as users can navigate it by interacting with nodes and standard texts, selecting different groups of standards, depth levels, graph display options, etc. For example, the network depth in Figure 2 is 2; i.e. from the K-ESS2 standard, we follow up to two connection steps.
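The notion of network depth can be sketched as a breadth-first traversal that stops after a fixed number of connection steps. This is an illustration only, not the NGSS Explorer's code: the connections are treated as undirected, which is an assumption, and apart from K-ESS2 the node names are placeholders.

```python
from collections import deque


def subnetwork(edges, start, max_depth):
    """Return {node: depth} for all nodes within max_depth steps of start."""
    neighbours = {}
    for a, b in edges:  # build an undirected adjacency map
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not follow connections beyond the requested depth
        for nxt in neighbours.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return depth


# Hypothetical chain of linked standards: K-ESS2 - S1 - S2 - S3.
EDGES = [("K-ESS2", "S1"), ("S1", "S2"), ("S2", "S3")]
within_two_steps = subnetwork(EDGES, "K-ESS2", max_depth=2)
```

With a depth of 2, S3 is left out because it is three connection steps away from K-ESS2.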

Figure 1: NGSS Topic Earth’s Systems at grade K as a table of PEs and linked standards (The bolded standard codes in the Articulation section, are links to tables representing those standards)

How Does All This Work?

When a user submits a request for a specific network, the JavaScript running in the browser extracts only the nodes and connections implied by the user's request from its complete network. When a user requests a specific network, and the JavaScript program running in the browser has determined which nodes and connections are involved, it sends a request to the TeachEngineering HTTP server for the location information of the network's nodes (step 1 in Figure 4).

Figure 5: NGSS Explorer architecture

A Word Processor and a Text Editor Are Not the Same Thing

The effect of these newlines, of course, is that when you drag the file into a text editor, both foo and goo are on their own lines. Therefore, if you insist on using word processing software for encoding, always save your file in .txt format (Note that you may need to rename the file after saving to adjust the file extension).

Which Text Editor to Use?

The Attack

All the machines making the requests were based in China (you can geotrace these IPs at https://gsuite.tools/traceroute) and all the requests came from software identifying itself as BitTorrent, a well-known file sharing protocol. This led us to conclude that somehow — most likely by accident, but possibly on purpose — our TeachEngineering machine had registered itself as part of the BitTorrent file sharing network and that we were being flooded with BitTorrent requests.

Now what?

What most likely happened

Inspecting the link automotivetouchup.com/touch-up-paint/green-is-more-than-a-paint-color-for-cars. HDDQCRX is a strange email name, so who or what is WHOISPRIVACYPROTECT.COM (WHOIS PRIVACY PROTECTION SERVICE, INC).

Figure 1: Home page of enrichingkids.com.

Detecting Robots

Two of the most telling variables are hit rates and inter-arrival times; i.e. the time elapsed between two consecutive visits by the same IP address. In addition, the standard deviation of the inter-arrival times (over the first 100 hits) is 38.12 seconds, indicating a very periodic hit frequency.
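Both variables are straightforward to compute from a log of hits. The sketch below assumes the log has already been reduced to (IP address, timestamp in seconds) pairs; the addresses and timestamps are made up, and the use of the sample standard deviation is an assumption, since the text does not say which variant was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def inter_arrival_times(timestamps):
    """Time elapsed between each pair of consecutive visits."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def profile(hits):
    """Per IP address: hit count plus mean and spread of inter-arrival times."""
    by_ip = defaultdict(list)
    for ip, timestamp in hits:
        by_ip[ip].append(timestamp)
    report = {}
    for ip, stamps in by_ip.items():
        gaps = inter_arrival_times(stamps)
        report[ip] = {
            "hits": len(stamps),
            "mean_gap": mean(gaps) if gaps else None,
            # stdev needs at least two gaps, i.e. at least three hits.
            "gap_stdev": stdev(gaps) if len(gaps) > 1 else None,
        }
    return report


# Hypothetical log: one very regular visitor and one single visit.
HITS = [("10.0.0.1", t) for t in (0, 60, 121, 180, 241)] + [("10.0.0.2", 500)]
report = profile(HITS)
```

A small spread relative to the mean gap points to a machine-like, periodic visitor, whereas human visits tend to produce far more irregular gaps.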

Figure 1: 15 largest April 2016 hitters and their hit counts.

Images

Figure 1: TeachEngineering’s original developers from the University of Colorado, Duke University, Colorado School of Mines, Worcester Polytechnic University and Oregon State University
Figure 2: Hierarchical structure of TE documents. (a) General example. The curricular unit C1 has three lessons (L1-L3) and six activities (A1-A6)
Figure 3: Partial rendering of a TeachEngineering activity.
