Around 4 years ago I managed an Accessibility project with MOSS 2007 for a major client. The site averaged a million hits a day and we were legally required to meet AA compliance. This is also referred to as Priority 2 by some and adds a whole set of checkpoints over and above Priority 1. To read up more, take a look at the following link (http://www.dda-audit.co.uk/audit/wai-aa.htm). Each new version of SharePoint has improved things, but the material is valid even now, and let’s face it, a lot of the public sector is still on MOSS 2007.
SharePoint has been heavily criticized, and rightly so, for not generating ‘accessible’ markup. In the following sections I will tackle issues with SharePoint, as well as those outside the control of SharePoint, to give a detailed picture. Remember that this is a detailed guide, not a complete one; a search on the internet will show up many other issues (for instance, with iPad support) that have since become prominent.
Why is it important?
The following are some of the most common reasons:
- A general desire to engage those with needs for accessible content.
- To reach a broader audience; especially with mass adoption of ‘mobile’ and touch devices.
- Legislation may stipulate this, which is especially true of the public sector.
- Consistency may be improved. Certainly knowledge of accessibility can help improve the design of the site across different browsers and readers.
- Crawlers can better utilize the semantics within a document to understand and index the content better.
- Others?
Where do we start with SharePoint
Remember, Accessibility isn’t just about the end markup generated; it is the complete process from the moment an author starts to create content to what is delivered to the end user.
The above diagram depicts the whole process: content is either added directly by the user, copied into SharePoint from external sources, or accessed directly from external sources and shown on your SharePoint site. When it comes to authoring content ourselves, we have a number of choices, and these are depicted in the following diagram. For the content that exists within our SharePoint content databases, we can also run tools to clean it. Do remember that content may be manipulated on its way into the content databases, as well as at retrieval time, by the SharePoint controls responsible for storing or retrieving it. The external data that we consume directly through content web parts, iFrames etc. requires that we provide a level of cleaning within those. Finally, we need to consider not only the individual parts but also their sum, the page: individual parts may each be accessible and still break compliance rules when put together on a page. This aspect is handled by reporting techniques that check the whole page and report compliance errors.
The following sections look at these in turn…
Rendering SharePoint Page
A SharePoint page is made up of a number of server-side controls that are placed on it in design mode. When a request is received, SharePoint renders each control with dynamically generated content, mostly from its content database, but not necessarily so. A typical SharePoint server control might be an image control. This may be bound to an image column on the server. When the page is rendered, SharePoint would spit out a variant of the <img> tag. Similarly, SharePoint has controls such as ScriptLink, which renders a script tag for JavaScript files, and so on. Some of these design controls are visible within the master page or a web part; others may be auto-generated by the SharePoint assemblies. Here is a lowdown on the main ones and what needs to be done to make them more accessible:
MasterPage: This contains a number of ScriptLink, CssLink and other controls that need to be rendered in a compliant way. Where direct Html is concerned, the rendering can easily be changed. Where server-side controls are concerned (like ScriptLink), you sometimes have the option of not using that control and replacing it with markup that is more appropriate. However, this is not always a viable option and you need to future proof this as well.
Web Part Zone: By default (in MOSS 2007) SharePoint spits out a web of tables. Under accessibility tables should only be used for pure tabular data, not for positioning elements and layout. Luckily, a simple control adapter can be used that will output a DIV and assign classes appropriately to manage layout using CSS.
SharePoint Web Parts: This is the hardest area. Your solution will depend on which web part you are using. You would probably have the option to extend a web part and control rendering that way, or for simpler web parts you could develop your own. For instance, if all you are showing is a list of links, then it might be worthwhile to ‘mimic’ the core code (you can use a product like Reflector to inspect the code in one of the SharePoint assemblies).
SharePoint Document libraries: This is an area that I have not worked on. Recent versions of SharePoint have made it more accessible and you may find third party controls. Apologies if this was the only reason you were reading this. I hope someone who knows more about this area can comment. I know, for instance, that DocExplore for iPad makes viewing document libraries on Apple devices possible, and may even have the requisite level of accessibility built in.
Custom Web Parts: You are the master of your world! However, when it is someone else who has developed these, there may be little you can do unless they provide you with flexibility in rendering, the source code or a version that meets Accessibility requirements.
Field Controls: I have re-written a number of Field controls with more accessible versions. This is simple, as you inherit from the base field control and then override it with your custom logic. On any page where the original field control is being used, you simply replace it (and the reference) with your version. It’s a simple method, but may have limited uses, as you have to know exactly where a control is being used and also be in a position to change its markup to yours.
Menus: Similar to field controls, web parts etc. Either substitute the default menu control in the master pages with your own or perhaps write a control adapter. My preference has been to replace the one that exists, but that’s purely because we used a single master page, a single look-and-feel and no ability for the user to customize and apply themes. CSS Friendly adapter contains an accessible implementation.
Content Authoring
The above looked at rendering; this section looks at authoring. SharePoint allows you to author content in various places:
- Inside Content Editor Web Parts: These web parts can be placed on any web part page and custom text added.
- Publishing Pages and visible through PageLayouts: Any content field of type Html (Multiline Text and subtype of Html) will display a default Html editor control when editing. This will allow the input of Non-Compliant Html.
- Blogs: Blogs use a type of ‘mark-down’, a shorthand version. Sorry, I am not an expert on this, so I will refrain from commenting too much.
- Others?
The following sections will describe some of the tools that we can use to ensure that content is authored or retrieved in a compliant manner.
Authoring Tools and Tips
There are a number of tools that can help in authoring accessible Rich Html content. The built-in WYSIWYG Rich Text Editor (at least in MOSS 2007) was far from compliant. A number of alternatives exist:
Telerik’s HtmlRadEditor for SharePoint
This is what I have used. Out of the box it is capable, but we had to make a number of tweaks;
By default, it output <br /> when Enter was pressed. We wanted paragraphs. In fact, our requirement was that no content should exist outside a paragraph. The main file that you need to change is the ConfigFile.xml under the control templates folder. My version is shown below. It shows a number of customizations, some of which had not been documented at the time. There are a couple of JavaScript functions that you can override as shown below (OnClientPasteHtml, OnClientLoad). These exist in MossEditorTools.js and we modified them, as will be shown later.
The toolbar allowed a number of options that we wanted disabled. This is done in the ToolsFile.Xml. The following shows a snippet. Interestingly, it shows that you can use the built-in MOSS tools, such as the image picker that comes with MOSS, instead of the Telerik ones:
The following two images show modifications to the MOSSEditorTools.js file.
What I have tried to show is a flavor of the various things that need to be done to achieve even the simplest of things: making an accessible Html editor more accessible!
Once you have made modifications to the Telerik Rad Editor, you need to get the RadEditorMoss.ddf file and then recompile / regenerate the WSP. The Telerik controls sit at the same level as the web server extensions folder in the 12 hive, and you could not use the available solution generation tools.
TinyMCE
This is an alternative. It has been widely used within ASP.NET to allow rich content where possible. I have not used it much, but it is definitely a worthy contender.
Overall, whatever tool you use, ensure that the following criteria are met:
- You are aware of the baggage it comes with: the number of extra HTTP requests it will make and the extra page load that it will add.
- It can be used multiple times on the same page.
- It is extendable and adaptable. It must provide the necessary hooks to control what is displayed on the toolbar, what markup is generated and how you can customize it.
- It can support some level of preconfigured templates. Telerik didn’t when we used it last and unfortunately we failed to provide some of the productivity boosts and consistency we wanted.
- Copy and paste from Word etc. is supported in a way that ensures compliance with accessibility standards.
- A way to switch off inline styling, especially when copy and paste is involved.
- A way to integrate validation and approval workflow. What this means is that the control (or page) should allow you to edit content, highlight breaking rules and only allow you to save a draft, but not publish. At the control level, the control should be able to signal to the page that it is not in a valid state for publishing and the page to have the ability to disable publish if it detects these signals.
Control Adapters
Every server-side control has a default rendering; this determines what HTML it spits out. The idea is that you can override the default rendering for most (not all) controls that provide the appropriate hook. Then, for a particular browser type (IE4, mobile, search etc.), the alternative rendering will be used if it is configured in the compat.browser file for that IIS site.
The following are some of the common control adapters that I have written, and possibly the most common ones you would encounter.
Once you have the control adapter in a compiled assembly, either in the GAC or the IIS bin directory, you need to reference it in compat.browser and let IIS / SharePoint know when that adapter is to be substituted. You may have an adapter that is only substituted for certain mobile devices, or another that is only for search and so on. When a page with a particular control is encountered, the SharePoint / IIS rendering code checks whether to render it using the default or whether an adapter is registered and should be used instead. The following shows a typical registration.
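As an illustration of the registration described above, an entry in an ASP.NET browser definition file generally takes the shape below. The adapter class and assembly names (MyCompany.Accessibility...) are hypothetical placeholders, not the names used on the original project. Using refID="Default" applies the adapter to every browser, whereas a narrower refID would limit it to specific devices. An assembly deployed to the GAC would need its full strong name in adapterType.

```xml
<browsers>
  <browser refID="Default">
    <controlAdapters>
      <!-- controlType: the control whose default rendering is replaced.
           adapterType: hypothetical adapter class and assembly name. -->
      <adapter controlType="Microsoft.SharePoint.WebPartPages.WebPartZone"
               adapterType="MyCompany.Accessibility.WebPartZoneAdapter, MyCompany.Accessibility" />
    </controlAdapters>
  </browser>
</browsers>
```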
To give a real-life example of the usefulness of control adapters (and of ingenious ways of using them): we used a Teaser web part to tease content of some pages on others. Any given page could have a number of these teaser web parts teasing content from other pages. This caused issues for search crawlers, as they would index pages based on the content of the teasers, although the topic of the page itself was something completely different. We used a control adapter for teaser web parts to render nothing when crawled by a search engine. Simple but effective!
The following image shows the code for a Web Part Zone adapter. By default, SharePoint would spit out numerous tables; we have instead chosen to use only a single Div with a class. You could also add code that iterates through the web parts and, if any content exists, wraps them with their own div. This may give you a better ability to control the layout and the look & feel through CSS.
Sometimes SharePoint just doesn’t give you any hooks to latch onto. One interesting problem was SharePoint spitting out the language attribute when adding its core JavaScript files to the page. Unfortunately these could not be controlled at the master page level, and changing the page render method was never a solution (see Overwrite Page Render). Fortunately for me (after days of head-scratching!), I decided to try reflection: reaching into the array list of files that SharePoint generates to spit out and removing the offending attributes from it. Anyway, without getting into the complexities, it’s better understood with the actual code.
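The decision behind the teaser adapter can be sketched in a few lines. This is a language-neutral Python illustration, not the original adapter: in ASP.NET the choice of adapter is made by the browser definition files rather than inside the control, and the crawler markers and function names below are assumptions made up for the example.

```python
# Substrings that identify a crawler's User-Agent header. These values are
# assumptions for illustration; a real list depends on the crawlers you target.
CRAWLER_MARKERS = ("googlebot", "bingbot", "ms search")


def is_crawler(user_agent: str) -> bool:
    """Return True when the User-Agent looks like a search crawler."""
    agent = user_agent.lower()
    return any(marker in agent for marker in CRAWLER_MARKERS)


def render_teaser(teaser_html: str, user_agent: str) -> str:
    """Render the teaser for people, and nothing at all for crawlers."""
    return "" if is_crawler(user_agent) else teaser_html
```

The point of the design is that the page itself stays untouched; only the teaser's output changes depending on who is asking.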
I am proud of it – still not aware anyone else figured it out as well. It worked!!! Performance can be improved using third party reflection libraries, but heck, I leave that as an exercise for someone else.
Overwrite Page Render
The idea is that you could run a bunch of regular expressions, clean out the offending markup and voila! There are two problems with this. Firstly, while it might be easy enough to clear simple stuff, it is much harder to tackle a whole page without the danger of affecting semantics, layout and flow. Secondly, any attempt to interfere with the page render methods on an authenticated site will result in login and other substitution controls being thrown out of their proper place and printed at the top or bottom of the page in a bizarre manner. SharePoint (or .Net) does not like this interference. Some people on various blogs have pointed to this as a solution, but those who allow customizations through forms authentication etc. should stay clear of these naïve solutions.
Accessibility Controls
CSS Friendly Adapters are perhaps what most of you would be aware of, and the main adapter here is the menu adapter that most MOSS sites have used. However, there is nothing stopping someone rolling their own controls that inherit from a SharePoint field control. I have developed the following to ensure the markup is what I want it to be:
- An Image Field
- A Note field
- Link Field
The principle is quite simple, as shown by the following:
Your control inherits from the base control and then overrides the render method with your own. When placed on a page (ASPX page), you import the assembly that contains your compiled code and use that in your control markup.
Html Agility Pack
This is a .Net assembly that can be downloaded from CodePlex. It is a forgiving parser that builds an Html DOM / object model from even typically malformed Html. It then allows XPath / XQuery type retrieval, and the later versions also have support for Linq. This can help in cleaning content, but do look around as this may not be the only such library. I have used it and hence can recommend it. The benefit over typical regex parsing, detection and removal lies in the complexity of achieving something similar with RegEx, especially with multiline content, missing closing tags, look-ahead, look-behind and so on.
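To illustrate why a forgiving parser is easier to work with than a pile of regular expressions, here is a minimal sketch that uses Python's standard html.parser as a stand-in. It is not the Html Agility Pack API. The policy of unwrapping <font> tags and stripping style attributes is an assumption chosen for the example, and the sketch tolerates unclosed tags without repairing them. Comments and doctype declarations are dropped because no handler re-emits them.

```python
from html import escape
from html.parser import HTMLParser

DROPPED_TAGS = {"font"}    # presentational wrappers to unwrap (assumed policy)
DROPPED_ATTRS = {"style"}  # attributes to strip (assumed policy)


class MarkupCleaner(HTMLParser):
    """Re-emit HTML while dropping unwanted tags and attributes."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []

    def _open_tag(self, tag, attrs, closer):
        if tag in DROPPED_TAGS:
            return
        rendered = ""
        for name, value in attrs:
            if name in DROPPED_ATTRS:
                continue
            # Attributes without a value (e.g. "disabled") are kept bare.
            rendered += f" {name}" if value is None else f' {name}="{escape(value)}"'
        self.parts.append(f"<{tag}{rendered}{closer}")

    def handle_starttag(self, tag, attrs):
        self._open_tag(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._open_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        if tag not in DROPPED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded by the parser, so re-escape them.
        self.parts.append(escape(data, quote=False))


def clean_markup(html: str) -> str:
    """Return the fragment with the dropped tags and attributes removed."""
    cleaner = MarkupCleaner()
    cleaner.feed(html)
    cleaner.close()  # flushes any text still buffered at the end of input
    return "".join(cleaner.parts)
```

Because the parser hands over tags and attributes as structured values, multiline attributes and missing closing tags need no special handling in the cleaning logic.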
Other concerns
The following are a few more concerns with content authoring that must be addressed:
- Word copy: A lot of people tend to write using Microsoft Word (just like this blog post) and then copy & paste. Some editors generate simply unintelligible markup. If you’re looking at cleaning markup through tools, ensure you disable such copy & paste.
- Document structure: Whilst the content/markup is important, it is also important that there is only one H1, that H2 follows H1 and so on. It is easy to lose this semantic when a page is created from various individual unrelated parts.
- Inline Styles: In one place, our Front-End developers required that no inline styling be used.
- An approval process that doesn’t stifle productivity and result in undue hindrance. Such an approval process must be built into content authoring to ensure the ‘cleanliness’ of data improves.
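The document structure rule in the list above lends itself to an automated check. The following is a minimal sketch in Python, using the standard html.parser; the function name and the wording of the messages are made up for illustration, and the two rules it applies (exactly one H1, no skipped heading levels) are one reading of the requirement.

```python
from html.parser import HTMLParser
from typing import List


class HeadingCollector(HTMLParser):
    """Collect heading levels (1 to 6) in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased; "h1".."h6" qualify, "hr" does not.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_problems(html: str) -> List[str]:
    """Report a wrong number of h1 elements and any skipped heading levels."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()

    problems = []
    h1_count = collector.levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one h1, found {h1_count}")

    previous = 0
    for level in collector.levels:
        if level > previous + 1:
            if previous:
                problems.append(f"h{level} follows h{previous}")
            else:
                problems.append(f"h{level} appears before any h1")
        previous = level
    return problems
```

Running such a check against the fully rendered page, rather than against individual fields, is what catches structure lost when unrelated parts are combined.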
On Going Maintenance
There are two concerns here;
- A report that can be run daily or weekly. This will check the consistency of complete pages.
- A tool that can be run periodically to clean data. This can clean individual content fields, list columns etc.
Reporting
There are online tools that validate individual pages, Firefox extensions (and probably IE ones as well), or you could build your own that can traverse a given page list, or buy commercial ones (e.g. HiSoftware Compliance Sheriff). Regular reporting based on these tools should be incorporated into the workflow. Perhaps someone can write a PowerShell script to do it!
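A home-grown report can start very small. The sketch below, in Python, checks a set of already-fetched pages for images that have no alt attribute, which is only one of many checkpoints a real report would cover. The input shape (a mapping of URL to page Html) and the report wording are assumptions; fetching the pages is left out.

```python
from html.parser import HTMLParser
from typing import Dict, List


class MissingAltCounter(HTMLParser):
    """Count <img> elements that carry no alt attribute at all."""

    def __init__(self) -> None:
        super().__init__()
        self.missing = 0

    def handle_starttag(self, tag, attrs):
        # Self-closing <img ... /> tags also arrive here by default.
        if tag == "img" and "alt" not in dict(attrs):
            self.missing += 1


def build_report(pages: Dict[str, str]) -> List[str]:
    """Return one line per page that has images without alt text."""
    rows = []
    for url, html in sorted(pages.items()):
        counter = MissingAltCounter()
        counter.feed(html)
        counter.close()
        if counter.missing:
            rows.append(f"{url}: {counter.missing} image(s) without alt text")
    return rows
```

An empty alt attribute is deliberately not reported, since that is the usual way to mark a purely decorative image.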
Content Cleaning
No matter what you do, the user will find ways to add content that is not fully clean. For instance, you may have specialized content from a third party that is added via a batch process, you may have aggregated data from news sites, or you may have comments etc. that third parties provide. All this data needs to be stored in the background as is. However, when displayed it must meet the accessibility requirements. Similarly, whilst a number of editors make writing accessible content possible, they do not enforce it. A user is still able to add non-accessible content, and unless you develop a workflow that disallows saving (and sometimes this is not possible anyhow) the content will require cleaning regularly to ensure that, at the very least, certain minimal standards are met: removal of <BR> when <P> tags should enclose paragraphs, removal of some inline styling, and so on.
The above shows some common RegEx expressions and some match evaluators. Match evaluators are a powerful technique that allows you to manipulate a match once it is found. To do something similar within a RegEx alone would be extremely cumbersome, if possible at all.
The idea behind the tool was to loop through all Publishing Sites (our main concern at the time), then through all Publishing pages, and within them through all fields (as shown below). We looked at certain types of fields (see code below) and, once a populated field was found, cleaned it and updated the page with the cleaned markup.
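For readers unfamiliar with the idea, the Python equivalent of a match evaluator is a function passed to re.sub, which is called for every match and returns the replacement text. The sketch below is not the original .Net code; the two patterns (strip quoted inline style attributes, lower-case tag names) are assumptions picked as simple examples of cleaning rules.

```python
import re

# Matches the start of an opening or closing tag, e.g. "<P", "</DIV", "<br".
TAG_NAME = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)")

# Matches a quoted inline style attribute, including its leading whitespace.
STYLE_ATTR = re.compile(r"""\s+style\s*=\s*("[^"]*"|'[^']*')""", re.IGNORECASE)


def lowercase_tag(match: re.Match) -> str:
    """Match evaluator: rebuild the start of the tag with a lower-case name."""
    return "<" + match.group(1) + match.group(2).lower()


def clean_fragment(html: str) -> str:
    """Apply a plain replacement, then an evaluator-based replacement."""
    without_styles = STYLE_ATTR.sub("", html)
    return TAG_NAME.sub(lowercase_tag, without_styles)
```

The first substitution is a plain search and replace; the second needs the evaluator because the replacement depends on what was matched.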
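The shape of that loop can be sketched without SharePoint at all. The Python below uses small in-memory stand-ins (Site, Page, Field) that are entirely hypothetical, as is the set of field types treated as cleanable; the real tool worked against the SharePoint Object Model, which is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

# Field types assumed to hold Html worth cleaning (hypothetical names).
CLEANABLE_TYPES = {"HTML", "Note"}


@dataclass
class Field:
    name: str
    type_name: str
    value: str = ""


@dataclass
class Page:
    url: str
    fields: List[Field] = field(default_factory=list)


@dataclass
class Site:
    url: str
    pages: List[Page] = field(default_factory=list)


def clean_sites(sites: Iterable[Site], cleaner: Callable[[str], str]) -> List[str]:
    """Clean every populated, cleanable field and return the changed page URLs."""
    changed = []
    for site in sites:
        for page in site.pages:
            page_changed = False
            for fld in page.fields:
                if fld.type_name not in CLEANABLE_TYPES or not fld.value:
                    continue  # skip other field types and empty fields
                cleaned = cleaner(fld.value)
                if cleaned != fld.value:
                    fld.value = cleaned
                    page_changed = True
            if page_changed:
                changed.append(page.url)
    return changed
```

Returning the list of changed pages keeps the update step separate, so only pages whose markup actually changed need to be checked in again.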
The following shows a couple of Match Evaluators and a sample of things I had to clean.
As a separate post, I can describe a Cleaning tool that goes through the content on a publishing portal and cleans out various bits of data using simple search and replace, regex replace and some more advanced techniques. Anyone interested in this, please shout!!!
Final thoughts
Document Libraries
I remember developing a custom web part for this. The web part used the NVelocity template engine and NVelocity templates. The data access used the SharePoint Object Model and CAML to query the data, although more options are possible now, including Linq, oData, Client OM and so on. A basic set of columns can be shown, links to documents provided and some basic sort offered. However, you do lose the richness of SharePoint. Each newer version of SharePoint has made the document view more accessible though!
Edit Mode
I generally talked about view mode (content consumption), but editing SharePoint pages is a minefield in itself. Unfortunately, and especially in MOSS 2007, there is little that you can achieve if edit mode AA compliance is required. By this I mean making the Edit Mode of a SharePoint page more accessible, as opposed to simply ensuring that the controls placed on an edit page enforce the storing of compliant Html. Most of what we talked about still applies, but a whole lot more issues will rear their heads. The pain is just too much!
Html Emails
If you are using Html emails and you thought dealing with multiple browsers was a pain, email clients are much worse. It is generally recommended to buy an off-the-shelf product and integrate it with your SharePoint email functionality.
Ajax / JavaScript
Ajax cannot be used to clean up markup, as many screen readers may not support this method. The markup generated must be clean when delivered to the browser or another device.
The Combo-box
This is not a pure HTML control. It is a browser control and at least at one place I had to create a simpler HTML/CSS based alternative.
A page is made up of many elements
The page is made up of discrete sections. A user, for instance, is able to add web parts from many sources. Two links, on two separate web parts, named the same can point to completely different pages. Although we talked about this in passing, it is necessary to reiterate it here: it is easy to tackle problems in isolated controls, but when those controls are combined on a page, new problems can appear.
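The duplicate link problem is a good example of something only a whole-page check can find. The Python sketch below collects every link on a rendered page and reports link texts that lead to more than one destination. The function names are made up, and comparing texts case-insensitively with whitespace collapsed is an assumption about what counts as "named the same".

```python
from collections import defaultdict
from html.parser import HTMLParser
from typing import Dict, List


class LinkCollector(HTMLParser):
    """Collect (normalized link text, href) pairs for every <a href>."""

    def __init__(self) -> None:
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse whitespace and ignore case when comparing link texts.
            text = " ".join("".join(self._text).split()).lower()
            self.links.append((text, self._href))
            self._href = None


def ambiguous_links(page_html: str) -> Dict[str, List[str]]:
    """Map each link text to its targets when it points to more than one."""
    collector = LinkCollector()
    collector.feed(page_html)
    collector.close()

    targets = defaultdict(set)
    for text, href in collector.links:
        targets[text].add(href)
    return {text: sorted(hrefs) for text, hrefs in targets.items() if len(hrefs) > 1}
```

Two web parts that each render a "Read more" link to different pages would each pass a check in isolation, but show up here once they share a page.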
Obviously, I have not covered everything, but I hope I have highlighted enough to make most people aware of the issues, and provided ways to tackle them. It has been a tough ask to write this, especially some 3-4 years after the original work was done, but for the sake of the community I wanted to make it available.
Happy reading!