Feeds:
Posts
Comments

It it a well known fact that MOSS 2007 caused some rising opinions on the subject of list scalability and performance.  Many developers operated under the misconception that SharePoint lists only allowed 2000 list items before croaking out with bad performance.  Nothing could be further from the truth.

The following article talks about this issue of large lists in great depth and highlights the point that SharePoint can actually handle many more than 2000 list items in any one list.  However, “query” this data is affected by the item count and SharePoint architects should design their data access and presentation of data accordingly.

http://technet.microsoft.com/en-us/library/cc262813.aspx

Microsoft has added a number of new enhancements to lists in SharePoint 2010 to handle larger capacity and the query of this data, and the following is a short summary of the enhancements:

List Column Indexing

SP2010 now allows list designers to create up to 20 indices (some of multiple columns) on any one list.  These indices allow for faster query of data when the list size exceeds that of typical.

image 

Under list settings and in the columns section; users now see a link to Indexed Columns.

The following is a list of column types usable as indexed columns:

· Single line of text

· Choice field, but not multi choice

· Number

· Currency

· Date/Time

· Look up, but not a multi value look up

· Person or group, but not multi value

· Title, except in a document library

 

List Throttling

SharePoint administrators now have the capability to better control list queries so that developers (or general users) may issue list queries on large lists that may potentially bring down the server.  Specifically:

Administrators may define some limits at the web application level:

- Configure the number of items fetched for queries

- Administrators may receive warnings when thresholds exceeded

- Ability to configure time periods for expensive queries to operate

- Limit the size of list items (default to 8k)

- Limit the number of columns in a join (default to 6)

The following code will display the list throttling limits for the site collection:

using (SPSite site = new SPSite(siteUrl))
            {
                Console.WriteLine("MaxItemsPerThrottledOperation:{0}",
                    site.WebApplication.MaxItemsPerThrottledOperation);
                Console.WriteLine("MaxItemsPerThrottledOperationOverride:{0}",
                    site.WebApplication.MaxItemsPerThrottledOperationOverride);
                Console.WriteLine("MaxItemsPerThrottledOperationWarningLevel:{0}",
                    site.WebApplication.MaxItemsPerThrottledOperationWarningLevel);
            }

To enable list throttling on any list be sure to toggle the setting with the following:

SPList.EnableThrottling = true

MaxItemsPerThrottledOperationWarningLevel – If a list exceeds the number of items specified in this threshold then a warning is displayed on the list settings page.

What MaxItemsPerThrottledOperation – This indicates the number of list items returned to non-administrators.  Administrators can query up to the threshold in What MaxItemsPerThrottledOperationOverride but will receive a warning on the list settings page.

If administrators wish for users to execute expensive operations in specific window of time they can do so by using the following method on the WebApplication object: SetDailyUnthrottledPrivilegedOperationWindow

 

RBS Storage (Remote Blob Storage)

In some cases the use of document libraries to store large files is no longer scalable and causes content databases to become unmanageable.  An example situations where a public web site, hosted in SharePoint, provides users with rich media content – web files and large images – is once such example of the large blob storage issue. 

In MOSS, hosting content in the database provided certain benefits, such as single storage location, versioning, and access of files via the object model.  Whereas file based storage provided better scalability at the cost of orphaned content from the object model.  SP2010 solves this issue with RBS.  Site Architects can now store large files (blobs) in alternate locations to that of the SharePoint content database without relinquishing access via the OM.  From an and developer standpoint, the data is accessed as if it were in the content database, but the content is actually in a remote location.

To enable RBS you’re farm will need to use at least SQL Server 2008 R2.

Marking blobs as external at the content database level enables SharePoint to store the meta-data associated with blobs in the database while storing the actual blob content outside the content database.  Because RBS is handled at the database level, SharePoint is unaware that data is not stored in the content database but in another location.

in a future time, vendors will bring RBS providers for SP2010 to the table, but in the meantime Microsoft has provided RBS for SQL server as an extra download:

http://go.microsoft.com/fwlink/?LinkId=177388

See my next blog post on configuring RBS.

 

From the SharePoint 2010 book I’m reviewing

If you see an HTTP 404 when accessing the /_vti_bin/ListData.svc WCF service in SharePoint 2010 then be sure to install the ADO.NET Data Services 1.5 CTP2

Link

Ever have a problem remembering something small, I do.  No matter how many projects I develop in SharePoint I cannot seem to remember how to reference the site collection URL from a user control.  So, after digging around for 5 minutes to find the answer I decided to blog it and save my aging memory:

<%$SPUrl:~SiteCollection/ %>

If you want to make sure that you aren’t using any of the restricted APIs before you deploy your solution to a sandbox environment, manually reference your project against:

[SharePoint Root]\UserCode\assemblies\Microsoft.SharePoint.dll  

If your code compiles, then you’re pretty safe!

NEVER DEPLOY code with this Microsoft.SharePoint.dll reference, instead reference the Microsoft.SharePoint.dll in

[SharePoint Root]\ISAPI folder.

From the SharePoint 2010 book I’m reviewing

If you find yourself in a situation when you need to attach to a process from VS 2010, to debug,  and you’re code is running in a “sandbox”.  You need to attach to the process SPUserCodeService.exe, not w3wp.exe.

SharePoint 2010 hosts sandbox code in a secure service – the user code service – so it can monitor resources and make sure the user’s code isn’t performing any action that could compromise the platform.

From the SharePoint 2010 book I’m reviewing

SP2010 Blogging Open

Just in case you missed it, public blogging of SharePoint 2010 is now permitted – let the flood gates open ;)

The SP2010 Developer Dashboard allows developers to review object model calls, database queries, web part events – and the timings for these various happenings.

The following code enables the dashboard:

SPPerformanceMonitor SPPerfMon;
SPPerfMon = SPFarm.Local.PerformanceMonitor;
SPPerfMon.DeveloperDashboardLevel = SPPerformanceMonitoringLevel.On;
SPPerfMon.Update();

The following code turns it off again:

SPPerformanceMonitor SPPerfMon;
SPPerfMon = SPFarm.Local.PerformanceMonitor;
SPPerfMon.DeveloperDashboardLevel = SPPerformanceMonitoringLevel.Off;
SPPerfMon.Update();

Some important points to remember when developing against SP2010:

  • Make sure your Visual Studio project is set up for .NET 3.5, not .NET 4.0
  • Run Visual Studio as an Administrator to load debugging symbols
  • Make sure your project is set to compile for Any CPU or x64 (not x86 by default), otherwise your code will throw a FileNotFoundException

From the SharePoint 2010 book I’m reviewing

Anyone who has created a new SP2010 Farm using the SharePoint Technologies Configuration Wizard will know that it makes a huge mess of SQL server, littering new databases with GUID extensions.

Those of you who have anal tendencies, like me, can sleep at night with nice clean database names with a script as follows:

Note: I am working on a script that will create Service Applications, which also like to create many GUID database instances.

@echo off

if "%1" == "" goto usage
if "%2" == "" goto usage

pushd C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\BIN
echo Creating New Farm Instance
psconfig.exe -cmd configdb -create -server %1 -database %2_SharePoint_Config -user DEV\MOSS_Farm -password Sharepoint3 -passphrase Sharepoint3 -admincontentdatabase %2_SharePoint_Admin_Content

echo Creating Central Admin Site
psconfig.exe -cmd adminvs -provision -port 8000 -windowsauthprovider onlyusentlm

echo Installing Services
psconfig.exe -cmd services install

echo Securing Resources
psconfig.exe -cmd secureresources

echo Starting Search Service
stsadm.exe -o osearch -action start -role Index -farmcontactemail support@portalsolutions.net -farmperformancelevel maximum -farmserviceaccount "DEV\MOSS_Farm" -farmservicepassword Sharepoint3

echo Starting WSS Search Service
stsadm.exe -o spsearch -action start -farmserviceaccount DEV\MOSS_Farm -farmservicepassword Sharepoint3 -farmcontentaccessaccount DEV\MOSS_Search -farmcontentaccesspassword Sharepoint3 -databaseserver %1 -databasename %2_WSS_App_Search

echo Installing Feature
psconfig.exe -cmd installfeatures

echo Installing Health Rules
psconfig.exe -cmd InstallHealthRules

popd
goto end

:usage
echo "SP2010-Provision.bat <SQLServer> <Project Acronym>"

:end

Search facets offer a powerful entry point into data exploration, especially in cases where data is categorized or tagged effectively.  With modern search and content retrieval mechanisms, such as the search engine in MOSS (and new FAST search engine for SharePoint) the traditional method of browsing for content in SharePoint using static navigation hierarchies can now take on a whole new approach.

The new thinking behind content storage is to put all artifacts in one large bucket, tag the artifacts, and then leverage search and facets to surface relevant content.  Think about how Google revolutionized mail by discarding the folder structure approach in favor of a search and label paradigm.

Anyone who has played with Faceted searching in SharePoint probably knows that, aside of the commercial tools like BA Insight, the only real free option is to use the Codeplex Faceted Additions.

fs3.png

The Codeplex offering assumes “post” search faceting, in that the web parts determine the relevant facet headings and count based on the current executed result set.  This approach makes good for filtering search results and allowing users to drill down on with restricted queries, but what about pre-facet browsing, similar to the functionality on sites like Best Buy?

Here is the problem – to provide the user with a dynamic tree view if hierarchical data based on facet categorization, the hierarchy generation method needs to know about all potential facet values ahead of time.  Take the following example:

An organization tags all their documents with a document type and department.  Let’s assume we wanted to provide a dynamic list of departments, which the user could choose, and then a list of document types available for the selected department.  After selecting the document type we’d like the user to see all documents of the selected type that sourced from the selected department.

Aside of issuing a general search, and then filtering the result set by department and document type, the Codeplex faceted search web parts do not appear to offer a mechanism to provide dynamic table-of-content like behavior.

So I got to thinking – search facets in MOSS are no more than managed properties that exist in the search index.  Surely the object model must enable me a way to query distinct values of a given managed property?  It turns out that you can query the search API for this information, and with a little code magic you can obtain the results desired:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using Microsoft.Office.Server.Search;
using Microsoft.SharePoint;
using Microsoft.Office.Server.Search.Query;

namespace SearchFacets
{
    /// <summary>
    /// Faceted search querying.
    /// </summary>
    class Program
    {
        static readonly string SRCURL = "http://server/";

        /// <summary>
        /// Entry point.
        /// </summary>
        /// <param name="args"></param>
        static void Main(string[] args)
        {
            using (SPSite srcSite = new SPSite(SRCURL))
            {
                string query = "SELECT DocumentType FROM Scope() WHERE \"SCOPE\" = 'My Scope Documents'";
                string[] values = GetDistinctSearchResults(srcSite, query, 1000);
                foreach (string s in values)
                    Console.WriteLine(s);
            }
        }

        /// <summary>
        /// Get some search results using full text.
        /// </summary>
        /// <param name="context">Site.</param>
        /// <param name="searchQuery">Query.</param>
        /// <param name="searchLimit">Limit results.</param>
        /// <returns>Results.</returns>
        static string[] GetDistinctSearchResults(SPSite context, string searchQuery, int searchLimit)
        {
            using (var fullTextQuery = new FullTextSqlQuery(context))
            {
                fullTextQuery.ResultTypes = ResultType.RelevantResults;
                fullTextQuery.QueryText = searchQuery;
                fullTextQuery.KeywordInclusion = KeywordInclusion.AnyKeyword;
                fullTextQuery.EnableStemming = false;
                fullTextQuery.TrimDuplicates = false;
                fullTextQuery.RowLimit = searchLimit;

                ResultTableCollection resultsCollection = fullTextQuery.Execute();
                ResultTable resultsTable = resultsCollection[ResultType.RelevantResults];
                return ReturnDistinct(resultsTable);
            }
        }

        /// <summary>
        /// Return distinct list.
        /// </summary>
        /// <param name="rtWins">Restult set.</param>
        /// <returns>Distinct values.</returns>
        static string[] ReturnDistinct(ResultTable rtWins)
        {
            DataTable dtWins = null;
            Dictionary<String, int> pairs = new Dictionary<string, int>();
            List<String> lstWins = new List<string>();
            dtWins = new DataTable("dtWINS");
            dtWins.Load(rtWins);

            foreach (DataRow drWin in dtWins.Rows)
            {
                string fieldName = drWin[0].ToString();
                if (pairs.ContainsKey(fieldName))
                    pairs[fieldName]++;
                else
                    pairs.Add(fieldName, 0);
            }

            foreach (KeyValuePair<String, int> pair in pairs)
                lstWins.Add(String.Format("{0} ({1})", pair.Key, pair.Value));
            return lstWins.ToArray();
        }
    }
}

You might be thinking “Hey, you’re just executing a search”, and you’d be right.  Since the facet values (managed property values map to crawled properties) live in the search indexes we have no choice but to perform a search to get at these values.

The key in the above code is to limit the search results returned (1000 in above case) and take advantage of relevancy.  In all likelihood; any search results beyond 1000 hits will not likely produce facet values that map to many results of value to the end user.

Clearly, the above code is just a starting point and has potential for many improvements, such as caching, making use of parent child relationships etc, but you get the idea…

« Newer Posts - Older Posts »

Follow

Get every new post delivered to your Inbox.