
Smallsite Design

Online management help

Find

This page lists all articles and categories alphabetically, and allows finding which articles have specific content.

The Work list page enables drilling down to articles by the subsites and categories that own them. That works if the ownership tree is remembered, whereas this page lists all articles and categories alphabetically so they can be found by just their ID. There is also a list of the most recently edited articles, and a way to list all articles that contain particular text or elements, though the latter requires knowledge of the internal structure of articles. While some examples will be presented, they are nowhere near exhaustive, and probably never will be.

Recent

The five most recently edited articles.

This list tracks the five most recently released articles. It excludes those currently being edited, which are already listed in the In progress section of the Work list page, and ignores any changes made on Article head pages. Each row provides a jump to the article's Article head page, along with how long ago it was updated and its Show status.

Articles

All articles are listed alphabetically by ID.

This is a simple list. While its navigation bar provides the usual periodic links into the list, it also provides a link to the first article for each first letter of the IDs, along with how many there are, in the form of a: 25. Each row provides a jump to the article's Article head page, along with its Show status.

Categories

All categories are listed alphabetically by ID.

This is a simple list. If there are more than four categories, its navigation bar provides the periodic links into it. Each row has a jump to the Category page for the category, along with its Show status.

Find

List all articles containing specified text or elements.

The differences between this facility and the search facility are:
  a. This offers finding elements by their place in the semantic structure of article bodies.
  b. Both search for text in listed article bodies, as well as their headlines, bylines, navigation text and introductions.
  c. This includes non-listed articles and the latest edit versions.
  d. Search includes listed category headings and descriptions.
  e. Search filters out embedded characters, whereas this allows finding them.

The body content of each article across all locales is stored in a single XML file, the hierarchical structure of which matches the semantic hierarchy of the article. The standard method to search within XML is XPath, a query language written as specially formatted plain text that specifies the hierarchical relationships between elements.

The find options are:
  a. Text – find the text. Case-sensitive.
  b. XPath – find the matching elements.

The Text option counts the number of text nodes or locale attributes that contain the text, regardless of how many instances of the text may occur in each. This means that the numbers shown are the minimum occurrences, but there may be more.
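The counting rule above can be illustrated with a minimal Python sketch. It is not the site's implementation: the sample XML, the `en` locale attribute and the helper name `count_matching_nodes` are assumptions invented for illustration, and for simplicity it checks every attribute rather than only locale attributes. It shows why the reported number is a minimum: a node that contains the text twice still counts once.

```python
import xml.etree.ElementTree as ET

# Hypothetical article fragment: element names follow the Elements section,
# but the content and the 'en' locale attribute are invented for this sketch.
SAMPLE = """
<art>
  <scn>
    <p>The cat sat near another cat.</p>
    <p>No match here.</p>
    <verse en="A cat in verse"/>
  </scn>
</art>
"""

def count_matching_nodes(root, text):
    """Count text nodes and attribute values containing `text` (case-sensitive)."""
    count = 0
    for elem in root.iter():
        # An element's leading text and its tail are separate text nodes.
        for node in (elem.text, elem.tail):
            if node and text in node:
                count += 1
        # Simplification: every attribute value is checked, not only locale ones.
        for value in elem.attrib.values():
            if text in value:
                count += 1
    return count

root = ET.fromstring(SAMPLE)
nodes = count_matching_nodes(root, "cat")   # nodes that contain the text
occurrences = SAMPLE.count("cat")           # raw occurrences of the text
print(nodes, occurrences)
```

Here the first paragraph contains the text twice but is counted once, so the node count is lower than the number of occurrences.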

While the Text option is straightforward, the XPath option is quite sophisticated, and using its more advanced features requires deep knowledge of the structure of the XML being searched. Only some simple examples are provided here, while the Elements section lists the full structures of elements. The worst that can happen is that the XPath expression is invalid. No articles will be harmed by using this facility.

Both options allow including embedded characters in the expression. For example, using the Text option to search for left-to-right markers with lr^ will show all articles where that character has been used to correct directional rendering mismatches.

The section includes a text area into which the text or XPath expression is typed, and clicking the Submit button below it will open the latest release and edit files for each article to see how many matches can be found in each. An error in the XPath expression will be indicated, but no clue as to what is in error will be given. If any articles matching a correct expression are found, they will be listed below the form. It will only take a couple of seconds, even if there are hundreds of articles.

The columns of the resulting list are:
#  Heading  Description
a  Article  Jump to the article's Article head page
b  Head     Number of article headlines, bylines, navigation text and introductions with matching text. For glossary or policies pages, an icon is shown instead
c  Release  Number of matching elements found in the latest release. If there is no release version, an icon is shown instead
d  Edit     Number of matching elements found in the latest WIP or draft. If there is no such version, an icon is shown instead
e  Done     Shows an icon if the latest release is newer than when the find was executed
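As a rough illustration of how the Release and Edit counts could be gathered, the following Python sketch checks both versions of each article and lists only articles with at least one match. Everything in it is hypothetical: the in-memory `ARTICLES` dictionary stands in for the stored XML files, and the article IDs, the helper names and the use of `None` for a missing version are assumptions, not the facility's actual behaviour.

```python
import xml.etree.ElementTree as ET

# Hypothetical store: article ID -> latest release and edit XML (None if absent).
ARTICLES = {
    "about": {"release": "<art><scn><p>x</p><table/></scn></art>", "edit": None},
    "guide": {"release": None, "edit": "<art><scn><table/><table/></scn></art>"},
    "intro": {"release": "<art><scn><p>x</p></scn></art>", "edit": None},
}

def count_matches(xml_text, path):
    """Return the number of elements matching `path`, or None if no such version."""
    if xml_text is None:
        return None
    return len(ET.fromstring(xml_text).findall(path))

def find(path):
    """List (article ID, release count, edit count) for articles with any match."""
    rows = []
    for article_id in sorted(ARTICLES):
        versions = ARTICLES[article_id]
        release = count_matches(versions["release"], path)
        edit = count_matches(versions["edit"], path)
        if release or edit:  # skip articles with no matches in either version
            rows.append((article_id, release, edit))
    return rows

rows = find(".//table")
for row in rows:
    print(row)
```

ElementTree requires relative paths, so `.//table` is used here where the facility itself would take `//table`.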

Typically, this facility will be used to find articles that may need updating. To help keep track of those, the Done column indicates which articles have been updated after the list was generated. This presumes that the edits are implementing what the find was for. Make sure that the latest edit of an article occurred after the find, as its timestamp is what is used as the time of the release's update, not when the release was instigated.

Elements

List of the element names and their XML tag name to search by.

While element blocks and hover buttons show the full names of elements, many of their XML representation names are shorter. Most inline element XML names (as shown in Inline insert) and many block element names are just the name of their HTML rendering elements, so at least when actual HTML is seen, they will be familiar.

While some HTML elements may have the same name in different levels in the rendered document's hierarchy, such as a section which can be a child of the article, or as a subsection in a section, to simplify the rules controlling what elements can be contained in each, they are given different XML names. For example, the XPath representation of the example given in the first paragraph of this section, relative to the art element at the root, is scn/ssn/table/tr/tc, where scn is a section, ssn is a subsection, tr is a row, and tc is a cell.

If that expression is used, it would indicate how many cells occur in subsections in each article. As there are a limited number of valid combinations of those elements, there are ways of shortening that expression, such as //ssn//tc. The // indicates any descendant, but because a subsection can only appear in a section, and a cell can only appear in a table row, the shorter expression is just as specific. The Append cell of an element's element block indicates what it can validly contain.
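The equivalence of the long and short paths can be checked with a small Python sketch. The sample article is invented for illustration, and Python's ElementTree supports only a subset of XPath and requires relative paths, so `.//ssn//tc` stands in for `//ssn//tc`. This is a sketch of the idea rather than how the facility itself evaluates expressions.

```python
import xml.etree.ElementTree as ET

# Hypothetical article: one table directly in a section, one in a subsection.
ARTICLE = """
<art>
  <scn>
    <table><tr><tc>top-level cell</tc></tr></table>
    <ssn>
      <table>
        <tr><tc>a</tc><tc>b</tc></tr>
        <tr><tc>c</tc></tr>
      </table>
    </ssn>
  </scn>
</art>
"""

root = ET.fromstring(ARTICLE)  # root is the art element

full_path = root.findall("scn/ssn/table/tr/tc")  # every step spelled out
short_path = root.findall(".//ssn//tc")          # any cell below any subsection
all_cells = root.findall(".//tc")                # any cell anywhere

print(len(full_path), len(short_path), len(all_cells))
```

Both forms find the same cells in the subsection, while the unqualified `.//tc` also picks up the cell in the table that sits directly in the section.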

Of course, getting to particular cells or any other elements in the path would require some extra information, and that is where it starts getting complicated, requiring some knowledge of the element's attributes and options, or where it stores its text. XPath is very powerful and generally concise, and while an alternative way of doing the same thing could be done using picklists and the like, it would very quickly become unwieldy, and still be very limited, though XPath expressions themselves can also get that way. The trick is in finding the most concise unique expression.


Legend:
  a. * – XML name is the same as the HTML name.
  b. ! – text is in attributes named after their locales.
  c. ~ – rich-text content.
  d. # – contains blocks.
  e. Inline elements contain their own text.

The formatting inline element tag names are:
t        text
br       break  *
del      deleted  *
bdo      direction  *
em       emphasis  *
ins      inserted  *
key      key  *
mark     mark  *
ref      reference  *
sel      selection  *
s        strike  *
strong   strong  *
sub      subscript  *
sup      superscript  *

The code inline element tag names are:
cy       commentary
kbd      keyboard  *
var      variable  *

The inline rich text element tag names are:
code     code  *  ~
q        quote  *  ~
qs       subquote  ~
samp     sample *  ~

The functional inline element tag names are:
cite     citation  *
mail     email
file     file
icon     icon
a        link  *
mic      media icon
tel      telephone
time     time  *
val      value

Some basic block elements are:
hr       horizontal line
p        paragraph  ~
verse    verse  !
btn/bc   button/comment  ~
aside/#  aside  */blocks

The list structure is:
list     list
  o       introduction  ~
  li      item  ~

The figure structure is:
fig      figure
  o      introduction  ~
  fq     quote  !
  lb     label

The diagram element structure is:
diag       diagram
  o         introduction  ~
  db        box  !
  dt        text  !
  dm        marker
  dl        line
  da        anchor
  do        overlay
    db       box  !
    dt       text  !
    dm       marker
    dl       line
    da       anchor
    ds       suboverlay
      db      box  !
      dt      text  !
      dm      marker
      dl      line
      da      anchor

The table element structure is:
table     table
  o        introduction  ~
  th       header
    ta      heading  !
  tr       row
    tl      label  !
    tc      cell  ~
  tf       footer
    tl      label  !

The sequence element structure is:
seq        sequence
  o         introduction  ~
  spt       part
    scp      caption  !
    sim      image
    spr      pointer
    sau      audio
    sgp      group
      scp     caption  !
      sim     image
      spr     pointer
      sau     audio
    str      track
    sop      output

The general article element structure is:
#        blocks
gl       glossary
  o       introduction  ~
  ge      entry  !  ~
scn      section  #
  gl      glossary
    o      introduction  ~
    ge     entry  !
  ssn     subsection  #
    gl     glossary
      o     introduction  ~
      ge    entry  !  ~

The procedure article element structure is:
#          blocks
proc       procedure
  pr        role  !
  o         introduction  ~
  stepa     step link
    sa       action  #
      sn      notes  #
    sr       response  #
      sn      notes  #
  step      step
    so       objective  ~
    sa       action  #
      sn      notes  #
    sr       response  #
      sn      notes  #
  steps     substeps
    so       objective  ~
    ssa      substep link
      sa      action  #
        sn     notes  #
      sr      response  #
        sn     notes  #
    sst      substep
      so      objective  ~
      sa      action  #
        sn     notes  #
      sr      response  #
        sn     notes  #

The test article element structure is:
#         blocks
qns       questions
  qn       question
    inf     information  #
    qt      statement  ~
    qo      option  !
    qi      incorrect  ~
  cms      comments
    cm      comment  ~

The glossary page structure is:
#        blocks
gls      glossary
  ge      entry  ! ~

The policies page structure is:
#        blocks
psf      recommended section
  pi      item  !
  p       paragraph  ~
psn      optional section
  pi      item  !
  p       paragraph  ~

Attributes and options

Some clues to how to derive the likely names for element attributes and their option values.

Specifying attributes and option values can be a way of narrowing down which elements in the XPath statement are included, and thus minimising the number of articles to look through in the output list.

Many elements have attributes that specify either structure or appearance options, in addition to some of their text. While the attributes section of the element block shows their full names, their presentation in the XML is very terse, often only a single letter. These are usually the initial letter of the English name, but where that would duplicate another name, a different letter from a strong part of the name is used instead, such as the last letter or a consonant.

For example, for colouring the text for some inline elements, the blue option is b, but the brown option is n. As colour can be applied to several inline elements, to avoid precluding the use of meaningful letters for any other options that those elements may need, x is used for the colour attribute name. For some attributes, the default option means there is no actual attribute, but that can also be tested for in XPath.

Attributes are denoted by a preceding @ in expressions, such as @n. Test for no attribute by not(@n).
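Attribute tests can be tried out with a short Python sketch. The `x` colour attribute and its `b` and `n` options come from the description above, but the fragment itself, and the choice of `em` and `strong` as coloured elements, are assumptions for illustration. Python's ElementTree has no `not()` function, so the absence test is emulated with an ordinary filter; in the facility itself, `//em[not(@x)]` would be the direct equivalent.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: 'x' is the colour attribute, 'b' is blue, 'n' is brown.
FRAGMENT = """
<art>
  <scn>
    <p><em x="b">blue</em> <em x="n">brown</em> <strong>plain</strong> <em>default</em></p>
  </scn>
</art>
"""

root = ET.fromstring(FRAGMENT)

blue = root.findall(".//em[@x='b']")   # like //em[@x='b']
coloured = root.findall(".//*[@x]")    # like //*[@x]

# Emulates //em[not(@x)]: emphasis elements left at the default colour.
uncoloured_em = [e for e in root.iter("em") if "x" not in e.attrib]

print(len(blue), len(coloured), len(uncoloured_em))
```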

Basic XPath find examples

These are some basic XPath expressions that may be useful.

Be aware that PHP is limited to XPath 1.0 syntax and functions, so do not use anything that is only available in later versions, such as the ends-with(), matches() or lower-case() functions.


Some basic examples of things to find are:
Elements
//table
//*[name()='table']
self::art[//p and //fig]
self::art[//p and //fig]//*[contains('fig|p',name())]
where the first two are equivalent in finding all tables, the third indicates whether the root element has both paragraphs and figures, and so will show 1 if it does, whereas the last will show the numbers of those elements. Note that to match unambiguously against a list of element names that may have parts in common, while keeping the list easy to extend, the ending qualifier for the last expression would have to be like:
//*[contains('|a|table|',concat('|',name(),'|'))]
Disabled elements
//*[@s]
//*[contains(@s,'m')]
//*[contains(@s,'n')]
which list all articles with disabled elements, only those manually disabled, and only those with errors, respectively. The last will include every element that has children with errors, as that helps to troubleshoot errors, but will not list released articles, which must have no errors.
Minimum number of an element
self::art[count(//p)>5]//p
which will list articles with more than five paragraphs.
Has sections without numbering
self::art[not(@n) and scn]
where the not(@n) indicates that the article does not use numbering. The scn is included to make sure that only general articles that actually have some sections are found, otherwise whether they use numbering is irrelevant. To find articles with numbering but no sections, use:
self::art[@n and not(scn)]
Has figures with prefixes
//fig[@p=//*[contains('list|table',name())]/@p]
which finds all figures that have prefixes that match a list or table somewhere in the article. The second // makes sure that the search for matches starts from the article root again, as the matching list or table can be anywhere in it. For example, a figure may be in an aside, but its matching table is next to the aside so they appear side by side, as in Procedure article.
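The reason for wrapping names in delimiters when matching against a list of element names, as in the Elements example above, can be seen in a small Python sketch. XPath's contains() is a plain substring test, which Python's `in` operator mirrors here. The function names are hypothetical, and `ta` (the table heading element from the Elements section) is used as the example of an unwanted match.

```python
def loose_match(name, names="a|table"):
    """Mirrors contains('a|table', name()): true for any substring of the list."""
    return name in names

def strict_match(name, names="|a|table|"):
    """Mirrors contains('|a|table|', concat('|', name(), '|'))."""
    return f"|{name}|" in names

for name in ("a", "table", "ta"):
    print(name, loose_match(name), strict_match(name))
```

The loose form accepts `ta` because it is a substring of `table`, while the delimited form only accepts whole names, and extending the list is just a matter of adding another name followed by a delimiter.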

Powered by: Smallsite Design © Patanjali Sokaris