Data and AI Platform Engineer, Application Dev- C#.NET, Databricks Apps

Tuesday, February 24, 2015

SharePoint Pre-Upgrade/Migration Checklist

Here are excerpts from a white paper published by Metalogix. It covers many pre-upgrade/migration scenarios that can help track down or prevent upgrade/migration issues. I have personally applied many of these in past projects, especially environment clean-up, where we would ideally delete orphan sites/site collections, unused sites, and unused groups/user accounts, track down large site collections and break them into smaller ones, archive old records, and apply database-level maintenance plans.

Here are the key excerpts from the White Paper:

INTRODUCTION
No migration to SharePoint 2013 should be taken lightly. SharePoint content has grown at a remarkable rate of 75% annually. With that growth, your company’s users demand 100 percent up time and expect few or zero lapses in their ability to access business-critical sites. The IT department, meanwhile, must ensure that only the right people can access certain content while overseeing a smooth, error-free transition that will minimally impact their workload.

It's a simple message: “Cleaning up your current SharePoint environment today and planning out your future migration to SharePoint 2013 will save you time, money and headaches.” The complexity lies, however, in determining what to clean up and how to plan for a successful migration.

Cleanup offers two major gains for SharePoint 2013 migration. First, it will increase the performance of your current environment by removing resource-intensive tasks. Second, it will expedite the overall time required to move from your current SharePoint environment to SharePoint 2013.

PRE-MIGRATION OF USERS AND GROUPS

No SharePoint environment would be complete without users. During the pre-migration analysis, identifying inactive and active users and groups not only helps you understand their post-migration expectations, it also gives administrators an opportunity to ask why inactive users are not using SharePoint for their collaboration needs. The result is less overhead and improved communication with your SharePoint users as you migrate to SharePoint 2013. The following sections explain how to identify these users.

IDENTIFY INFLUENTIAL USERS

Including your influential users in the planning and implementation stages of your migration can prove beneficial to the success of the migration. Users can provide both “on-the-ground” feedback during the migration and communicate their SharePoint 2013 requirements and expectations.

Start by identifying the most active SharePoint users and sites in your environment by writing PowerShell scripts. Communicate with these users throughout the process and help get early “buy-in” so that they will in turn communicate any changes or new features to their team. Examples of PowerShell scripts that a SharePoint admin or developer could write include a site risk assessment that gauges the largest sites/site collections, content database sizes, orphan sites, unused sites, large lists, complex workflows, and orphaned users/groups.

REMOVE ORPHAN USERS

One often overlooked area of the cleanup process is removing orphan users. These are employees who are no longer with the company and were removed from Active Directory but still have a SharePoint user account and associated permissions for sites or content. Orphan users may also have active SharePoint alerts or My Sites. While orphan users can no longer access the content, if they’ve created alerts, My Sites, and other mechanisms in your environment, those can impact SharePoint’s hardware resources. Cleaning out these users before your migration will help teams gain back resources and better predict SharePoint user growth.
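
As a rough illustration of the orphan-user check described above, the sketch below compares a list of SharePoint logins against a list of Active Directory accounts and reports the logins that are missing from AD. Both lists are hard-coded, hypothetical sample data, and the class and method names are my own. In a real farm the lists would come from the SharePoint object model and a directory query, neither of which is shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrphanUserReport
{
    // Returns SharePoint logins that have no matching Active Directory account.
    // Login names are compared case-insensitively.
    public static List<string> FindOrphans(IEnumerable<string> sharePointLogins,
                                           IEnumerable<string> activeDirectoryLogins)
    {
        var known = new HashSet<string>(activeDirectoryLogins, StringComparer.OrdinalIgnoreCase);
        return sharePointLogins.Where(login => !known.Contains(login)).ToList();
    }

    public static void Main()
    {
        // Hypothetical sample data, not taken from a real farm.
        var sharePointLogins = new[] { @"CONTOSO\alice", @"CONTOSO\bob", @"CONTOSO\carol" };
        var activeDirectoryLogins = new[] { @"contoso\alice", @"contoso\carol" };

        foreach (string orphan in FindOrphans(sharePointLogins, activeDirectoryLogins))
        {
            Console.WriteLine("Orphan user: " + orphan); // prints: Orphan user: CONTOSO\bob
        }
    }
}
```

The sketch only reports candidates. Deleting the accounts is a separate step that is best done after someone has reviewed the list.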

CLEANUP UNUSED SHAREPOINT GROUPS

When new site collections are created, often the default SharePoint Groups (owners, contributors, etc.) are also created (intentionally or not). Frequently, only a few of those groups are used. This information is valuable as you plan for your migration and can help reduce the number of Client Access Licenses (CALs) needed and reduce your user footprint. It’s also an opportunity to reach out to inactive users who are still with the organization to ask why they’re not using SharePoint.

CLEANUP UNUSED SITES

Unused sites that contain almost no content or have very limited activity are excellent candidates for archiving as you prepare the migration process. You can find under-utilized sites within any scope (Site Collection, Web Application or even the whole farm) and filter the report to display certain site definitions.

BREAKUP LARGE SITE COLLECTIONS

Site Collections (or sites) often outgrow their original size. This suggests that users are actively engaged in SharePoint. It also points to a potential resource issue. Very large site collections can impact performance which in turn will likely hinder your migration plans.

Typically, it makes sense to split the Site Collection during the migration process. Or you may want to split the content so that some of your users can migrate to SharePoint 2013 and start using it, while other groups use SharePoint 2010.

FIND LARGE SITES (AND PROMOTE TO SITE COLLECTIONS)

As part of a migration plan, you might decide to reorganize your sites and promote some of them to site collections based on their size. For example, all sites above 100 GB might be candidates for promotion to Site Collection.

Microsoft has documented best practices for the size of Site Collections that should be followed. http://technet.microsoft.com/en-us/library/cc262787(v=office.15).aspx#SiteCollection

FIND SMALL SITE COLLECTIONS (AND DEMOTE TO SUBSITES)

Another opportunity for cleanup is to identify small site collections and demote them to subsites. Often user requests for sites are satisfied by creating site collections. Over time, it might become apparent that those site collections could or should have been created as subsites of a different site collection. Another outcome of this analysis might be that you choose to just delete those site collections since they have very little use.
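
The promote/demote decision from the last two sections can be sketched as a simple size check. This is only an illustration: the 100 GB figure is the example from the text above, the 1 GB lower bound is an assumption I picked for the demo, and the URLs and sizes are made up. A real report would read sizes from the farm rather than from literals.

```csharp
using System;

public static class SiteSizeTriage
{
    // 100 GB comes from the example in the text; the lower bound is an assumption.
    public const double PromoteThresholdGb = 100.0;
    public const double DemoteThresholdGb = 1.0;

    // Suggests an action for a site or site collection based only on its size.
    public static string Classify(bool isSiteCollection, double sizeGb)
    {
        if (!isSiteCollection && sizeGb > PromoteThresholdGb)
            return "Promote to site collection";
        if (isSiteCollection && sizeGb < DemoteThresholdGb)
            return "Demote to subsite or delete";
        return "Leave as is";
    }

    public static void Main()
    {
        // Hypothetical inventory rows: URL, whether it is a site collection, size in GB.
        Console.WriteLine("/sites/hr/archive: " + Classify(false, 140.0));
        Console.WriteLine("/sites/picnic2012: " + Classify(true, 0.2));
        Console.WriteLine("/sites/finance: " + Classify(true, 45.0));
    }
}
```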

IDENTIFY AND MANAGE LARGE LISTS

Microsoft’s best practices for numbers of items in a list or folder are often exceeded due to the unanticipated growth of SharePoint. Using the Most/Least Storage reports generated out of PowerShell, you can identify large lists with excessive number of items. During the migration you are then in a position to split those lists.

You can also use the List Property report to analyze the columns to determine whether they are approaching the Microsoft recommended list view thresholds (2,000 or more items with 20 or more columns).

CLEANUP CONTENT

With PowerShell, you can generate a List and Library Storage report that includes last accessed and/or last modified metadata. With that knowledge, you can choose to archive some of the content as part of the migration effort and/or the migration can be configured to skip older content that is no longer needed.

Another effective way to maintain control over content growth is to implement policies around versioning. Using PowerShell scripts, you can gain insight into the storage impact of the document versions. The more versions an item has, the slower the migration of the item will be. With PowerShell, you can use Set List Properties to configure versioning across any scope in the farm. The script can then configure SharePoint to automatically prune the versions from the content.

REMOVE DUPLICATE FILES

It’s not uncommon for SharePoint to be the repository for the same document in multiple places. To improve migration performance and reduce user confusion, write a script to remove duplicate files.
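
One way to find duplicates is to group files by a hash of their content. The sketch below is a minimal version of that idea, assuming the file contents have already been read into memory. The URLs and contents are hypothetical sample data, and the class name is my own. It only reports duplicate groups and leaves the decision about which copy to keep to a person.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class DuplicateFinder
{
    // Groups file URLs by the SHA-256 hash of their content and returns
    // only the groups that contain more than one file.
    public static List<List<string>> FindDuplicates(IDictionary<string, byte[]> filesByUrl)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return filesByUrl
                .GroupBy(f => BitConverter.ToString(sha.ComputeHash(f.Value)))
                .Where(g => g.Count() > 1)
                .Select(g => g.Select(f => f.Key).OrderBy(u => u, StringComparer.Ordinal).ToList())
                .ToList(); // forces evaluation before the hash object is disposed
        }
    }

    public static void Main()
    {
        // Hypothetical sample content standing in for downloaded documents.
        var files = new Dictionary<string, byte[]>
        {
            { "/sites/hr/Policy.docx", Encoding.UTF8.GetBytes("travel policy v1") },
            { "/sites/finance/Policy-copy.docx", Encoding.UTF8.GetBytes("travel policy v1") },
            { "/sites/hr/Handbook.docx", Encoding.UTF8.GetBytes("employee handbook") }
        };

        foreach (List<string> group in FindDuplicates(files))
        {
            Console.WriteLine("Duplicate set: " + string.Join(", ", group));
        }
    }
}
```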

ARCHIVE THE AUDIT LOG

PowerShell scripts can be written to move the Audit Log data from the source content databases into a standalone SQL database. This provides access to audit information even after you’ve moved to the new SharePoint version. Once the data is in the archive, you can use SQL Server Reporting Services or other tools to create audit reports after migration. This ensures a continuous audit trail even after the farm is migrated.

WEB PART DECISIONS

Web parts are used everywhere – your SharePoint environment probably includes Microsoft web parts, custom web parts, and third party web parts.

As you plan your migration, it’s important to get an accurate assessment of the impact of any custom web parts used in your environment. This step will help you determine whether to update a custom web part so it works in 2013 or deprecate the component as its functionality is no longer needed or has been replaced by a new capability in SharePoint 2013.

Web part usage analysis will also assist in determining whether support/maintenance on third-party web parts should be continued. PowerShell scripts can report the web parts on each page.

Web part analysis gives you global insight into which web parts are being used and on which sites. The report can even filter for Closed or Hidden web parts.

The Web Part report can be run by site administrators or distributed to them so that they can understand which components will be migrated and which will not. This will set end user expectations and prevent miscommunication during migrations.

WHERE ARE THE WORKFLOWS?

Workflow reports can be generated via PowerShell to find details about workflow definitions, instances, and their state. This information is particularly important for migration since it shows the details of all in-progress workflows that cannot be migrated.

ARE THE CONTENT TYPES BEING USED?

Content types are a powerful tool in SharePoint that enable organizations to organize, manage, and handle content in a consistent way across site collections. Gaining insight into the custom content types and where they are being used will help determine how the data will be managed once you migrate. Investigating the use of Content types will also help to highlight opportunities for creating enterprise content types that will provide broader scope.

A Content Type report can be generated and filtered to show all content types or only those that are defined but not used. This information can be used to clean up the content types prior to the migration.

SHAREPOINT PERMISSIONS HIERARCHY

Using PowerShell reports, we can list the SharePoint sites/subsites that have broken inheritance and a unique set of permissions. This gives more insight into the complexity of the permissions structure and helps determine whether these sites should really have broken inheritance.

Alternatively you could use Metalogix tools known as ControlPoint and Content Matrix to report out all the above Pre-migration checklist parameters.

Wednesday, January 14, 2015

Developing Secure applications using Microsoft C#.NET - A primer

Here are some of the most important Security concepts/best practices on developing secure code using Microsoft C#.NET framework.

Threat Modeling:
a. Threat modeling involves understanding the complexity of the system and identifying all possible threats to the system.
b. These threats are analyzed based on their criticality and likelihood.
c. Decision is made whether to mitigate the threat or accept the risk associated with it.
d. Look at the system from the perspective of an adversary to help designers anticipate attack goals and determine answers to questions about what the system is designed to protect, and from whom.

Threat Modeling - Steps:
The threat modeling process consists of the following three high-level steps:
• Characterizing the system
• Identifying assets and access points
• Identifying threats.

Characterization:
• Characterizing the system involves understanding the system components and their interconnections, and creating a system model emphasizing its main characteristics.
• Then assets and access points of the system are identified.
• Identifying threats creates a threat profile of a system, describing all the potential attacks that need to be mitigated against or accepted as low risk.

Data Flow Diagram:
A data flow diagram (DFD) is a graphical representation of the "flow" of data through an information system, modeling its process aspects.
• DFDs can also be used for the visualization of data processing (structured design).
• A DFD shows what kind of information will be input to and output from the system, where the data will come from and go to, and where the data will be stored.
• It does not show information about the timing of processes, or information about whether processes will operate in sequence or in parallel.

Threat:
• The goal of this step is to identify threats to the system using the information gathered so far. A threat is the adversary’s goal, or what an adversary might try to do to a system.
• Sometimes a threat is also described as the capability of an adversary to attack a system.
• The best method for threat enumeration is to step through each of the system’s assets, reviewing a list of attack goals.
• Assets and threats are closely correlated. A threat cannot exist without a target asset.

STRIDE:
• The output of threat identification process is a threat profile for a system, describing all the potential attacks, each of which needs to be mitigated or accepted. In general, threats can be classified into six classes based on their effect:
• Spoofing - Using someone else’s credentials to gain access to otherwise inaccessible assets.
• Tampering - Changing data to mount an attack.
• Repudiation - Occurs when a user denies performing an action, but the target of the action has no way to prove otherwise.
• Information disclosure - The disclosure of information to a user who does not have permission to see it.
• Denial of service - Reducing the ability of valid users to access resources.
• Elevation of privilege - Occurs when an unprivileged user gains privileged status.

Attack Trees:
• Attacks are modeled through the use of a graphical, mathematical, decision tree structure called an attack tree.
• Attack trees are constructed from the point of view of the adversary. Creating good attack trees requires that we think like an attacker.
• In an attack tree vulnerability model, the topmost (root) node represents an objective that would be of benefit to one or more threat agents.

Attack Tree:
Attack trees provide a formal, methodical way of describing the security of systems, based on varying attacks. You represent attacks against a system in a tree structure, with the goal as the root node and different ways of achieving that goal as leaf nodes.
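
To make the tree structure concrete, here is a small sketch of an attack tree with the goal at the root and the concrete attacks at the leaves. It assumes every child is an alternative way of reaching its parent (an OR node), which is a simplification, and the goals in the sample tree are invented for illustration. Each root-to-leaf path it prints is one way an adversary could reach the goal.

```csharp
using System;
using System.Collections.Generic;

public class AttackNode
{
    public string Goal { get; private set; }
    public List<AttackNode> Children { get; private set; }

    public AttackNode(string goal, params AttackNode[] children)
    {
        Goal = goal;
        Children = new List<AttackNode>(children);
    }

    // Returns every root-to-leaf path; each path is one way to reach the root goal.
    public List<string> AttackPaths()
    {
        var paths = new List<string>();
        Collect(this, Goal, paths);
        return paths;
    }

    private static void Collect(AttackNode node, string prefix, List<string> paths)
    {
        if (node.Children.Count == 0)
        {
            paths.Add(prefix); // a leaf is a concrete attack
            return;
        }
        foreach (AttackNode child in node.Children)
        {
            Collect(child, prefix + " <- " + child.Goal, paths);
        }
    }
}

public static class AttackTreeDemo
{
    public static void Main()
    {
        // Hypothetical tree; the goals are examples only.
        var root = new AttackNode("Read confidential file",
            new AttackNode("Steal credentials",
                new AttackNode("Phish the user"),
                new AttackNode("Guess a weak password")),
            new AttackNode("Exploit unpatched server"));

        foreach (string path in root.AttackPaths())
        {
            Console.WriteLine(path);
        }
    }
}
```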

Mitigation:
• Once all potential threats have been analyzed and the threats that must be mitigated have been determined, the next task is to specify security requirements. A requirement is a statement of goals that must be satisfied. A security policy can be considered a set of security requirements.
• A good security system strikes a balance between what is possible and what is acceptable via the risk management process.
• The simplest way to prioritize threats is by using two factors: damage and likelihood.
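
The damage/likelihood prioritization can be sketched in a few lines. The sketch assumes both factors are scored from 1 to 10 and multiplied into a single risk number. That scale and formula are a common convention rather than something the text above prescribes, and the threats and scores are made-up sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Threat
{
    public string Name { get; set; }
    public int Damage { get; set; }      // 1 (minor) to 10 (severe), assumed scale
    public int Likelihood { get; set; }  // 1 (unlikely) to 10 (near certain), assumed scale

    // Assumed scoring rule: risk is damage multiplied by likelihood.
    public int Risk { get { return Damage * Likelihood; } }
}

public static class ThreatRanking
{
    // Returns the threats ordered from highest to lowest risk.
    public static List<Threat> Prioritize(IEnumerable<Threat> threats)
    {
        return threats.OrderByDescending(t => t.Risk).ToList();
    }

    public static void Main()
    {
        // Hypothetical threats and scores.
        var threats = new[]
        {
            new Threat { Name = "Log tampering", Damage = 3, Likelihood = 2 },
            new Threat { Name = "SQL injection", Damage = 9, Likelihood = 7 },
            new Threat { Name = "Session hijacking", Damage = 6, Likelihood = 4 }
        };

        foreach (Threat t in Prioritize(threats))
        {
            Console.WriteLine("{0}: risk {1}", t.Name, t.Risk);
        }
    }
}
```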

Risk Management:
There are four possible ways to manage a risk:
• Accept the risk - The risk is so low and so costly to mitigate that it is worth accepting.
• Transfer the risk - Transfer the risk to somebody else via insurance, warnings etc.
• Remove the risk - Remove the system component or feature associated with the risk if the feature is not worth the risk.
• Mitigate the risk - Reduce the risk with countermeasures.

The Three Pillars of .NET Security:
• You use authentication to ensure that you have correctly established the identity of a user.
• Once you have authenticated and identified a user, you authorize (authorization) the user to access one or more restricted resources.
• In terms of software security, when we trust someone, we grant that person access to one or more restricted resources.

.NET Security Namespaces
System.Security
System.Security.Cryptography
System.Security.Cryptography.X509Certificates
System.Security.Cryptography.Xml
System.Security.Permissions
System.Security.Policy
System.Security.Principal

Assembly
An assembly contains Microsoft Intermediate Language (MSIL) code and metadata: the code that the common language runtime executes and a description of the assembly itself. The .NET Framework uses the assembly as the basic unit of deployment and versioning and, most importantly, as the basic security boundary. As such, assemblies are considered to be security surfaces.

Strong name
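Strong names are used to protect assemblies, particularly shared assemblies. A strong name consists of:
• Simple name
• Version number
• Culture
• Public key, plus a digital signature created with the matching private key

A key pair for strong naming can be generated with the Strong Name tool:
sn -k MyKeyPair.snk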

Obfuscation
Obfuscation is the technique of altering the MSIL statements so that the application executes in the same way, but the output of a decompiler is unreadable. Primary techniques:
• Encryption
• Randomizing symbolic names
• Making application logic more difficult to follow
• The other alternative is writing security modules in native code.

CLR and Security
• Isolation with Application Domains
• Type safe code
• Unsafe code
• Verification
• Isolated storage
• Runtime security policy

CLR and Runtime Security Policy
• The runtime establishes the identity and grant set of an assembly when the assembly is loaded. You cannot alter the identity of an already loaded assembly to change the permissions granted to it.
• It inspects various characteristics of the assembly (evidence) to determine an identity for the code.
• The runtime uses the assembly’s evidence as input to a process named policy resolution.
• The result of policy resolution is a set of protected operations and resources to which the code within the assembly has access (permissions).

Evidence
The evidence objects you will most commonly use to define the identity of your assemblies are the seven standard evidence classes from the System.Security.Policy namespace:
• ApplicationDirectory
• Hash
• Publisher
• Site
• StrongName
• Url
• Zone

Permission request operations
• Requesting minimal permissions
• Requesting optional permissions
• Refusing permissions

Permissions
• DnsPermission
• PrintingPermission
• SocketPermission
• FileIOPermission
• ReflectionPermission
• RegistryPermission
• UIPermission
• And more

Code Access Security Explained
The .NET Framework includes an important feature known as code-access security (CAS), which provides fine-grained control over the operations and resources to which managed code has access. With CAS, access is based on the identity of the code, not of the user running it, and is enforced through permission sets.

CAS is effectively another layer of security that managed code must pass through before it can interact with protected system resources, such as the hard disk and the Windows registry.

Permission sets
Permission sets control access to protected operations such as:
• Creating, deleting, and modifying files and directories
• Reading from and writing to the Windows registry
• Using sockets to accept or initiate a network connection
• Creating new application domains
• Printing

Code Access Security is the enforcement of security policies at the code level. Security demands can be written in two styles:
• Imperative security statements appear in the body of your methods and functions, which means they end up forming part of the compiled intermediate language (IL) code contained in your assembly.
• Declarative security statements are expressed using attributes, which are statements compiled to form part of an assembly’s metadata.

Imperative
// Requires: using System.Security; using System.Security.Permissions;
public void CreateFile() {
    // Create a new FileIOPermission object
    FileIOPermission perm = new FileIOPermission(
        FileIOPermissionAccess.Write, @"C:\SomeFile.txt");
    try {
        // Demand the FileIOPermission
        perm.Demand();
    }
    catch (SecurityException se) {
        // Callers do not have the necessary permission, so stop here
        return;
    }
    // Method implementation...
}

Declarative
// Requires: using System.Security.Permissions;
[FileIOPermission(SecurityAction.Demand,
    Write = @"C:\SomeFile.txt")]
public void CreateFile() {
    // Method implementation...
}

CLR and Role-Based Security
The .NET Framework provides a generic role-based security mechanism to represent the identity and roles of the user on whose behalf code is running.

.NET’s role-based security interfaces provide a standard mechanism through which you can make runtime security decisions based on the identity and roles of a user.
Namespace: System.Security.Principal

Roles
Authorization is the process of determining what actions and resources a user has authority to access. A person’s authority is expressed in terms of roles. A role is a logical categorization that grants members of the role specific permissions.

An identity represents the user on whose behalf code is running. Normally, this is the currently logged-on Windows user, but this is not always the case. If your application authenticates users against an authority other than the Windows account system, then the identity may be different from the currently logged-on Windows user. A principal encapsulates an identity and the roles to which the identity belongs. These are represented by the IPrincipal and IIdentity interfaces.

// Create a GenericIdentity for the user Peter
GenericIdentity gi = new GenericIdentity("Peter");

// Create a GenericPrincipal for Peter and specify membership of the
// Developers and Managers roles
String[] roles = new String[] { "Developers", "Managers" };
GenericPrincipal gp = new GenericPrincipal(gi, roles);

// Assign the new principal to the current thread
Thread.CurrentPrincipal = gp;
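
Once a principal is attached to the thread, code can make runtime decisions with IPrincipal.IsInRole. The sketch below is a minimal, self-contained example of such a check. The CanApproveBudget method and the rule that only members of Managers may approve are hypothetical and exist purely for illustration.

```csharp
using System;
using System.Security.Principal;
using System.Threading;

public static class RoleCheckDemo
{
    // Hypothetical business rule: only authenticated members of Managers may approve.
    public static bool CanApproveBudget(IPrincipal user)
    {
        return user.Identity.IsAuthenticated && user.IsInRole("Managers");
    }

    public static void Main()
    {
        var identity = new GenericIdentity("Peter");
        var principal = new GenericPrincipal(identity, new[] { "Developers", "Managers" });

        // Attach the principal to the current thread, as in the snippet above.
        Thread.CurrentPrincipal = principal;

        Console.WriteLine(CanApproveBudget(Thread.CurrentPrincipal));           // True
        Console.WriteLine(Thread.CurrentPrincipal.IsInRole("Administrators"));  // False
    }
}
```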

Impersonation
Sometimes, you will want your code to act at the operating system level as though it is a different user than the one currently logged on. This is particularly common in server applications that process requests from different users and need to access resources, such as databases, on behalf of the user.

using System;
using System.IO;
using System.Runtime.InteropServices;
using System.Security.Permissions;
using System.Security.Principal;

[assembly: SecurityPermission(SecurityAction.RequestMinimum, UnmanagedCode = true)]

public class WindowsImpersonationTest {
    [DllImport("advapi32.dll", SetLastError = true)]
    static extern int LogonUser(
        String UserName, String Domain, String Password,
        int LogonType, int LogonProvider, ref IntPtr Token);

    public static void Main() {
        IntPtr token = IntPtr.Zero;
        // 2 = LOGON32_LOGON_INTERACTIVE, 0 = LOGON32_PROVIDER_DEFAULT
        int ret = LogonUser(@"Bob", ".", "treasure", 2, 0, ref token);
        if (ret == 0) {
            Console.WriteLine("Error {0} occurred in LogonUser", Marshal.GetLastWin32Error());
            return;
        }
        WindowsIdentity wi = new WindowsIdentity(token);
        WindowsImpersonationContext impctx = wi.Impersonate();
        // Code from here runs as Bob until Undo() is called
        StreamWriter file = new StreamWriter("test.txt");
        file.WriteLine("Bob's test file.");
        file.Close();
        impctx.Undo();
    }
}

Cryptography
• Use cryptography in any application that processes sensitive or valuable data.
• Encryption works on the basis that there is one piece of information that the attacker has not been able to acquire, known as the key. The key is used as part of the encryption process and is kept secret. Sender selects an encryption algorithm and uses the secret key to create the encrypted data. When Receiver gets the encrypted text, the secret key is used to restore the confidential message so that Receiver can read it.

Types of Encryption
• Some types of encryption require Sender and Receiver to know the key and are called symmetric encryption (because they share the same knowledge). The problem with symmetric encryption is that both parties need to agree on what the secret key will be before sending any messages.
• Another approach is to use asymmetric encryption, where only Receiver has to keep a secret. Receiver creates a special pair of keys, one of which is kept secret (known as the private key ) and one that is given out to anyone who wants to send a message (the public key ). Receiver can provide the Sender his public key openly, because he does not care if someone intercepts it.
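
The difference between the two approaches can be shown with a short sketch. The first half uses AES with a key that both parties would have to share. The second half uses RSA, where Sender only needs Receiver's public key. The message text, class name, and variable names are my own, and key exchange and storage are left out entirely. OAEP with SHA-1 padding is used here because, as far as I know, older providers such as RSACryptoServiceProvider do not support OAEP with SHA-256.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class EncryptionDemo
{
    // Symmetric: the same key (and IV) is needed to encrypt and to decrypt.
    public static byte[] AesEncrypt(byte[] plain, byte[] key, byte[] iv)
    {
        using (Aes aes = Aes.Create())
        using (ICryptoTransform encryptor = aes.CreateEncryptor(key, iv))
        {
            return encryptor.TransformFinalBlock(plain, 0, plain.Length);
        }
    }

    public static byte[] AesDecrypt(byte[] cipher, byte[] key, byte[] iv)
    {
        using (Aes aes = Aes.Create())
        using (ICryptoTransform decryptor = aes.CreateDecryptor(key, iv))
        {
            return decryptor.TransformFinalBlock(cipher, 0, cipher.Length);
        }
    }

    public static void Main()
    {
        byte[] message = Encoding.UTF8.GetBytes("Meet at noon");

        // Symmetric encryption: Aes.Create() generates a random key and IV.
        using (Aes shared = Aes.Create())
        {
            byte[] aesCipher = AesEncrypt(message, shared.Key, shared.IV);
            Console.WriteLine(Encoding.UTF8.GetString(AesDecrypt(aesCipher, shared.Key, shared.IV)));
        }

        // Asymmetric encryption: only Receiver holds the private key.
        using (RSA receiver = RSA.Create())
        using (RSA senderView = RSA.Create())
        {
            receiver.KeySize = 2048;
            // Sender imports the public half only.
            senderView.ImportParameters(receiver.ExportParameters(false));
            byte[] rsaCipher = senderView.Encrypt(message, RSAEncryptionPadding.OaepSHA1);
            Console.WriteLine(Encoding.UTF8.GetString(receiver.Decrypt(rsaCipher, RSAEncryptionPadding.OaepSHA1)));
        }
    }
}
```

In practice the two are often combined: RSA protects a randomly generated AES key, and AES protects the message itself.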

Integrity
• Integrity becomes an issue when Sender wants to send a message to Receiver but is concerned that someone will tamper with the message and change the contents. In this case, Sender does not care if anyone can read the message —they only want to make sure that the Receiver can detect any changes.
• The solution to this problem is to use a “keyed” hash code, which uses the contents of the message and a secret key to create the hash code. The Attacker can still modify the message, but can no longer create a valid hash code, because they lack the key used to create the original code.
• Unless the attacker is able to discover the key, they will be unable to create hash codes that will fool Receiver.
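
A keyed hash of this kind is available in .NET as HMAC. The sketch below uses HMACSHA256 to tag a message and then shows that a modified message no longer verifies. The message text and the class and method names are illustrative, and the sketch assumes Sender and Receiver already share the key.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class KeyedHashDemo
{
    // Computes the keyed hash (HMAC-SHA256) of a message.
    public static byte[] ComputeTag(byte[] key, byte[] message)
    {
        using (var hmac = new HMACSHA256(key))
        {
            return hmac.ComputeHash(message);
        }
    }

    // Recomputes the tag and compares it without stopping at the first difference.
    public static bool Verify(byte[] key, byte[] message, byte[] tag)
    {
        byte[] expected = ComputeTag(key, message);
        if (expected.Length != tag.Length) return false;
        int diff = 0;
        for (int i = 0; i < expected.Length; i++)
        {
            diff |= expected[i] ^ tag[i];
        }
        return diff == 0;
    }

    public static void Main()
    {
        // Random shared secret; how it reaches both parties is out of scope here.
        byte[] key = new byte[32];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(key);
        }

        byte[] message = Encoding.UTF8.GetBytes("Pay Bob 10 dollars");
        byte[] tag = ComputeTag(key, message);
        Console.WriteLine(Verify(key, message, tag));   // True

        byte[] tampered = Encoding.UTF8.GetBytes("Pay Bob 99 dollars");
        Console.WriteLine(Verify(key, tampered, tag));  // False
    }
}
```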

Digital Signatures
The goal of authentication is to allow Receiver to establish that Sender is the author of a message. For our purposes, this means that the Sender should be able to create a "digital signature” for the message and that Receiver should be able to check the signature to ensure that it is valid.

• Digital signatures rely on asymmetric encryption techniques, although they are applied differently than we discussed earlier. Sender creates a pair of keys, one of which is made public and one of which is kept private. To sign the message, Sender creates a cryptographic hash code of the message that is sent to Receiver, as discussed earlier. Sender then signs the hash code using the private key. This creates a digital signature that is unique to the combination of the document and the private key.
• When Receiver gets the message, they verify the signature using Sender’s public key. If the signature is valid, then Sender has “signed” the message, and Receiver can assume that an attacker has not forged the message.
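
The sign-and-verify flow can be sketched with the RSA class. SignData hashes the message and signs the hash with the private key in one call, and VerifyData repeats the hash and checks it against the signature using only the public key. The choice of SHA-256 with PKCS#1 v1.5 padding, the message text, and the class name are my own assumptions for this example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SignatureDemo
{
    // Sender: hash the message and sign the hash with the private key.
    public static byte[] Sign(RSA privateKey, byte[] message)
    {
        return privateKey.SignData(message, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
    }

    // Receiver: verify using only Sender's public key parameters.
    public static bool Verify(RSAParameters publicKey, byte[] message, byte[] signature)
    {
        using (RSA rsa = RSA.Create())
        {
            rsa.ImportParameters(publicKey);
            return rsa.VerifyData(message, signature, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
        }
    }

    public static void Main()
    {
        byte[] message = Encoding.UTF8.GetBytes("I approve the contract");

        using (RSA senderKey = RSA.Create())
        {
            senderKey.KeySize = 2048;
            RSAParameters publicKey = senderKey.ExportParameters(false); // public half only
            byte[] signature = Sign(senderKey, message);

            Console.WriteLine(Verify(publicKey, message, signature));  // True

            message[0] ^= 0xFF; // simulate tampering in transit
            Console.WriteLine(Verify(publicKey, message, signature));  // False
        }
    }
}
```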

Creating a hash
using System;
using System.Security.Cryptography;
using System.Text;

public class HashExample {
    public static void Main() {
        // Sample input; the original snippet does not show how x_message_data is built
        byte[] x_message_data = Encoding.UTF8.GetBytes("Sample message");
        // MD5 is kept from the original snippet; prefer SHA-256 for new code
        HashAlgorithm x_hash_alg = HashAlgorithm.Create("MD5");
        byte[] x_hash_code = x_hash_alg.ComputeHash(x_message_data);
        // Print each byte of the hash code as two hex digits
        foreach (byte x_byte in x_hash_code) {
            Console.Write("{0:X2} ", x_byte);
        }
    }
}

Tuesday, October 28, 2014

Top 5 Daily SharePoint Disasters - White Paper

Metalogix has published an interesting white paper about the top 5 daily SharePoint disasters that SharePoint administrators deal with almost every day while supporting small, medium, or large-scale enterprise-wide SharePoint farms.

Here are the top 5 daily disasters that occur in a SharePoint farm and their recovery strategies:

1. Restoring Missing Documents:

Administrators are frequently tasked with addressing user requests to restore missing SharePoint content. While the introduction of Recycle Bin in SharePoint 2007 went a long way to reduce these requests, they still occur.

Oftentimes, a user can retrieve documents without the assistance of an administrator, if they act quickly. However, SharePoint’s Recycle Bins do not hold their contents forever. Most environments are configured to automatically flush their Recycle Bins for reasons of content size (i.e., too much in a bin) and age (i.e., a document is too old).

Documents can also disappear as a result of workflow and information management policies (such as retention policies). These may move documents between sites and site collections without a user’s knowledge. In some cases, these same mechanisms may delete documents entirely.

Multiple deletions from a document library are another problem. In these cases a user may not know which documents were deleted – only that they need to restore everything that went missing after a particular time.

Follow these steps to mitigate the “disappearing document” problem:

> Ensure that both first and second stages of the SharePoint Recycle Bin are enabled. Ideally, an item-level backup and restore strategy should be implemented to ensure that documents can be recovered in the event that they “disappear.”

> If you’re charged with restoring a document and the above stages weren’t yet enabled, you’ll need to identify compositional differences between the past and present states of the target document library. This is a less than ideal solution: for anything but the smallest of document libraries, the process can be extremely tedious and time consuming with conventional SharePoint administrative tools.

2. Restoring from Previous versions of Documents:

Document version recovery is a complex process and critical to users. Be sure to adopt a layered approach to ensure document version restores are simplified.

> Turn on versioning within document libraries.

> Enforce a document check-in/check-out to ensure only one user is working on a document at any given time.

Restoring documents that don’t exist in the form that matches user expectations is another common problem.

This often occurs when versioning for the list or library isn’t enabled – each time a user saves a document back to SharePoint, that document overwrites its previous version. However, versioning isn’t a fix-all solution. If you’re storage-conscious and limit the number of document versions that can be retained – then a complete document history may not be available. In fact, once the number of check-ins permissible by the version retention policy is reached for a given document, SharePoint deletes older versions of that document to accommodate newer versions.

Documents deleted as a result of retention limits do not go to the SharePoint Recycle Bin – they are permanently gone.
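
The effect of a version limit can be shown with a toy model. The class below is not SharePoint code. It is a simplified sketch, with names of my own choosing, that keeps at most a fixed number of versions and drops the oldest one on each check-in beyond the limit, which is the behavior described above. The exact counting rules in a real library may differ.

```csharp
using System;
using System.Collections.Generic;

public class VersionHistory
{
    private readonly int _limit;
    private readonly Queue<string> _versions = new Queue<string>();

    public VersionHistory(int limit)
    {
        if (limit < 1) throw new ArgumentOutOfRangeException("limit");
        _limit = limit;
    }

    // Adds a version and returns any older versions dropped to stay within the limit.
    public List<string> CheckIn(string label)
    {
        var dropped = new List<string>();
        _versions.Enqueue(label);
        while (_versions.Count > _limit)
        {
            dropped.Add(_versions.Dequeue()); // gone for good, no Recycle Bin in this model
        }
        return dropped;
    }

    // Retained versions, oldest first.
    public string[] Retained
    {
        get { return _versions.ToArray(); }
    }
}

public static class VersionHistoryDemo
{
    public static void Main()
    {
        var history = new VersionHistory(3);
        foreach (string label in new[] { "1.0", "2.0", "3.0", "4.0", "5.0" })
        {
            foreach (string lost in history.CheckIn(label))
            {
                Console.WriteLine("Dropped version " + lost);
            }
        }
        Console.WriteLine("Retained: " + string.Join(", ", history.Retained));
    }
}
```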

3. Corruption in a Content Database:

Content database corruption is problematic because it affects all users working with the site collections in that database. A SharePoint content database can house hundreds or even thousands of site collections. Having a content database fall out of circulation has the potential to affect most, if not all, users in a SharePoint environment.

Once a SharePoint environment is in-use, most administrators just create site collections as they are needed without giving much thought as to where those site collections go. Additional databases only get created as they are needed – typically when the “current” content database is reaching Microsoft’s maximum size (which varies from 200 GB to quite a bit larger depending on the performance of the underlying SQL Server).

Out-of-the-box high availability (HA) mechanisms generally do little to help with corrupted content databases. Some HA mechanisms, such as SQL Server’s mirroring and AlwaysOn Availability Groups, have the ability to repair inconsistencies that occur during mirroring or replication. If the source of content database corruption is external to the mirroring or replication mechanism, though, then HA simply guarantees “highly available corruption” – not a way to repair SharePoint content.

Corrupted Content Databases are a tricky problem with very few practical fixes.

> Perform regular content database backups.

> As with all backup regimens, test frequently to ensure that you can restore as desired!

4. Deleted Site Collection:

It’s quite common for users to unintentionally delete SharePoint site collections or sub-sites.

Microsoft recognized the prevalence of this problem during the SharePoint 2010 timeframe and introduced the Site Recycle Bin as part of Service Pack 1. The Site Recycle Bin extended standard Recycle Bin support to include site collections and sub-sites that were deleted. Now, although end users can restore deleted sub-sites by themselves from within the SharePoint web user interface, you’ll still need to perform an administrative action to restore an entire site collection using the Site Recycle Bin.

This is due to the fact that site collections can’t be restored from within the web user interface or even Central Administration. Instead, restoring a site collection from the Site Recycle Bin has to be done with PowerShell (specifically, the Get-SPDeletedSite and Restore-SPDeletedSite cmdlets) on your SharePoint member server.

To prevent and mitigate the problem of deleted site collections:

> Ensure that SharePoint’s Recycle Bins are configured and enabled to catch deleted site collections.

> Implement a solid backup and restore strategy for site collections and/or their associated databases.

5. Deleted Permissions settings:

The deletion of unique permissions applied to SharePoint lists or libraries by absent-minded users is a tricky one to reverse, since SharePoint’s built-in data protection toolset offers no options for restoring permissions.

There are only two options: manually re-apply all of the custom permissions to items in the list, or delete the list and attempt to import a known “good copy” that was exported from a backup set. The former choice is especially time consuming, while the latter results in the loss of content modifications since the last backup. Either case is far from ideal because a compromise must be made. Previously executed restore tests will provide much needed confidence in any restore action.

Without the right out-of-the-box tools in place, a site collection (or content database) backup/restore is needed.

Monday, October 27, 2014

Best Practices for ASP.NET MVC

There is a nice write-up on the MSDN Blog site dedicated to Best Practices for ASP.NET Model-View-Controller pattern.

Read the full article at MSDN Blog: http://blogs.msdn.com/b/aspnetue/archive/2010/09/17/second_2d00_post.aspx

Wednesday, October 15, 2014

Optimize SharePoint Performance

Eric Shupps, a renowned Microsoft MVP, has written an excellent White Paper published by Metalogix on optimizing SharePoint performance. I found the tips valuable and agree with Eric's suggestions, as I have personally found many of them extremely helpful in improving SharePoint performance.

STEP 1 SEPARATE USER AND DATABASE TRAFFIC

A common misconception is that servers connected to a high-speed network segment (such as gigabit Ethernet or Fiber) will have plenty of bandwidth to perform all required operations. While it is unlikely that a high-speed link will ever be fully saturated at any given moment, it is quite possible that physical resource contention may have a direct impact on the performance of a specific operation. SharePoint places a tremendous amount of demand on SQL—each request for a page can result in numerous calls to the database, not to mention the additional overhead required by service jobs, search indexing, and other operations.

In order to mitigate the conflict between user and database traffic, connectivity between front-end servers and SQL should be isolated, either via separate physical networks or virtual LANs. Typically, this requires at least two separate network interface cards in each front end web server with static routes configured to ensure traffic is routed to the correct interface. The same configuration may also be applied to application and index servers, although the perceived benefits will be less as these server roles typically do not involve much direct user traffic (environments which utilize InfoPath Forms Services and Excel Services will benefit the most).

The following is an example of a typical configuration optimized for data connectivity. In this scenario, Server A functions as a Web Front End, while B performs index operations, C is an Active Directory domain controller and D hosts the SQL database:

An important issue that must be taken into account when considering the separation of WFE/Index servers and the SQL database is the use of firewalls. Network security policies often dictate the isolation of database servers behind a firewall which restricts communication to only a handful of necessary ports. Caution should be taken in this type of configuration to ensure that the firewall ports operate at sufficient speed for optimal data transfer and that only the minimal amount of required filtering and packet analysis is being performed. SharePoint applications are very data intensive and the database connection can often be the most significant bottleneck; whenever this connection is impeded by a firewall, thorough load and scalability testing should be performed to assess the impact on portal operations under heavy use.

STEP 2 ISOLATE SEARCH INDEXING

A typical medium server farm consists of one or more web front end servers, a dedicated index or application server and a separate SQL database server. In this scenario, search traffic initiated by the index server must be processed by the same servers responsible for delivering end-user content. The crawl process can create a tremendous amount of network traffic as the Search service must make a number of HTTP requests for each page in the hierarchy. Failure to isolate this traffic can have a negative impact on performance as search requests conflict with user traffic competing for the same limited amount of physical bandwidth.

In order to prevent search and user traffic from conflicting with each other, an additional server may be added to the farm which is dedicated solely to servicing search queries (alternatively, in smaller environments, the index server may also serve this function). The farm administrator would then configure the Search service to perform crawls only against this dedicated server, thereby eliminating excessive load on the web front end servers during index operations and reducing the amount of potentially conflicting requests. Depending upon the scope and frequency of search crawls, this configuration can reduce traffic to the web front end servers by as much as 70% during index operations.

STEP 3 ADJUST SQL PARAMETERS

SQL performance tuning is a discipline unto itself but there are some simple things that SharePoint farm administrators can do to improve performance without requiring the services of an experienced DBA. To begin with, the implementation of SQL should be planned at least as carefully as the SharePoint farm itself; perhaps more, considering the level of detail that can be involved. Physical hardware resources, network connectivity, disk size and speed, location of data files, configuration of shared storage—all aspects must be taken into consideration based on the size of the farm and the projected amount of data.

One quick way to avoid future headaches is to provision the major SharePoint databases onto separate physical disks (or LUNs if a SAN is involved). This means one set of disks for search databases, one for temporary databases, and still another for content databases (depending upon the size of the individual content databases these may require further separation). SharePoint is both read and write intensive, so separating the I/O operations onto separate disks prevents excessive thrashing and cache misses. Additional consideration should be given to isolating the log files (*.ldf); although these do not incur the same level of I/O as other files, they do play a primary role in backup and recovery and, because they can grow to several times the size of the master database files, can consume a great deal of disk space.

Another simple optimization technique is to proactively manage the size and growth of individual databases. By default, SQL grows database files in small increments, either 1MB at a time or as a fixed percentage of database size (usually 10%). These settings can cause SQL to waste cycles constantly expanding databases, especially in larger environments which utilize a great deal of storage space, and prevent further data from being written while the databases are expanding. An alternative approach is to first pre-size the databases up to the maximum recommended size (100GB) if space is available and set autogrowth to a fixed size (e.g. 10MB or 20MB). This will prevent SQL from expanding databases unnecessarily and ensure that growth happens in a manageable fashion.
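To get a feel for why pre-sizing matters, here is a rough back-of-the-envelope sketch in Python. It simply counts how many autogrowth events a data file would go through on its way to a target size. The starting size of 1GB and the function names are my own assumptions for illustration and are not taken from the white paper; the increments and the 100GB target are the figures quoted above.

```python
import math


def growth_events_fixed(start_mb: float, target_mb: float, increment_mb: float) -> int:
    """Count autogrowth events when each event adds a fixed number of MB."""
    if target_mb <= start_mb:
        return 0
    return math.ceil((target_mb - start_mb) / increment_mb)


def growth_events_percent(start_mb: float, target_mb: float, percent: float) -> int:
    """Count autogrowth events when each event adds a percentage of the current size."""
    events = 0
    size = start_mb
    while size < target_mb:
        size += size * percent / 100.0
        events += 1
    return events


# Assumed scenario: a 1GB file that eventually reaches 100GB (sizes in MB).
start_mb, target_mb = 1024, 100 * 1024

print("1MB increments: ", growth_events_fixed(start_mb, target_mb, 1))
print("20MB increments:", growth_events_fixed(start_mb, target_mb, 20))
print("10% increments: ", growth_events_percent(start_mb, target_mb, 10))
print("pre-sized file: ", growth_events_fixed(target_mb, target_mb, 20))
```

A pre-sized file needs no growth events at all until it passes its initial size, which is the point of the recommendation. The sketch only counts events; it says nothing about how long each event blocks writes on a given SQL Server.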

Finally, autogrowth, and its corollary, autoshrink, are prone to producing excessive fragmentation, especially when done in small increments. Even on fast disks fragmentation can have a substantially negative impact on performance, a situation which may be compounded by complex RAID configurations and distributed SAN storage. Disk defragmentation should be scheduled on a frequent basis to ensure that SQL is using resources effectively.

STEP 4 DEFRAGMENT DATABASE INDEXES

SQL Server maintains its own set of indexes for data stored in various databases in order to improve query efficiency and read operations. Just as with files stored on disk, these indexes can become fragmented over time as new INSERT, DELETE and UPDATE operations are performed. This can have a particular effect on SharePoint as many of the behind-the-scenes query operations are not optimized for large datasets (such as list views encompassing hundreds of thousands of items).

Index fragmentation is a complex topic best left to SQL DBAs but it is important for administrators to plan for regular maintenance operations which include index defragmentation. Special care should be taken to schedule these types of operations in a maintenance window as they are both resource-intensive and, in many cases, blocking tasks which prevent data from being written to or read from the indexes as they are being rebuilt. Service Pack 2 for SharePoint Server 2007 and WSS v3.0 contains updates to the functionality and scheduling of the database index management routines, but enterprise administrators who manage large farms may wish to implement their own index defragmentation maintenance tasks and schedules.
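For administrators who do build their own maintenance schedule, the decision logic can be sketched roughly as below. The 5% and 30% thresholds are a commonly cited rule of thumb for choosing between reorganizing and rebuilding an index; they are my assumption here and do not come from the white paper, and the index names are hypothetical. Your DBA may well prefer different values.

```python
from typing import Dict


def maintenance_action(fragmentation_percent: float,
                       reorganize_at: float = 5.0,
                       rebuild_at: float = 30.0) -> str:
    """Pick an action for one index from its fragmentation percentage.

    The default thresholds are assumed rule-of-thumb values, not fixed limits.
    """
    if fragmentation_percent > rebuild_at:
        return "rebuild"
    if fragmentation_percent > reorganize_at:
        return "reorganize"
    return "none"


def maintenance_plan(fragmentation: Dict[str, float]) -> Dict[str, str]:
    """Return only the indexes that need work, mapped to the chosen action."""
    plan = {}
    for index_name, percent in fragmentation.items():
        action = maintenance_action(percent)
        if action != "none":
            plan[index_name] = action
    return plan


# Hypothetical fragmentation figures gathered before a maintenance window.
sample = {"index_a": 3.2, "index_b": 12.5, "index_c": 47.0}
print(maintenance_plan(sample))
```

Since rebuilds are usually the heavier, blocking operation, a plan like this makes it easier to see how much of the maintenance window will be taken up by them.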

STEP 5 DISTRIBUTE USER DATA ACROSS MULTIPLE CONTENT DATABASES

Most SharePoint data is stored in lists: Tasks, Announcements, Document Libraries, Issues, Picture Libraries, and so forth. A great deal of this data is actually stored in a single table in the content database associated with the site collection. Regardless of how many sites and subsites are created within the SharePoint hierarchy, each Site Collection has only one associated Content Database. This means that a Site Collection with thousands of subsites is storing the bulk of the user data from every list in every site in a single table in SQL.

This can lead to a delay in the time it takes to insert items, render views, and navigate folders in any given list, as SQL must recursively execute the various queries over one potentially very large dataset. One way to reduce the workload is to manage the mapping of site collections to content databases. Administrators can use the Central Administration interface to pre-stage content databases to ensure that site collections are associated with a single database or grouped logically based on size or priority. By adjusting the Maximum Number of Sites setting or changing database status to ‘offline’, administrators can also control which content database is used when new site collections are created.
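When planning how to group site collections by size, a simple packing exercise can help before touching Central Administration. The sketch below uses a first-fit-decreasing approach, which is my own choice for illustration rather than anything SharePoint does for you; the site collection URLs and sizes are made up, and the 100GB cap is the recommended maximum mentioned under Step 3.

```python
from typing import Dict, List


def assign_site_collections(sizes_gb: Dict[str, float], max_db_gb: float) -> List[List[str]]:
    """Group site collections into content databases without exceeding max_db_gb.

    First-fit decreasing: take the largest site collection first and place it
    in the first database that still has room.
    """
    databases: List[List[str]] = []
    used_gb: List[float] = []
    for name, size in sorted(sizes_gb.items(), key=lambda item: item[1], reverse=True):
        for i, total in enumerate(used_gb):
            if total + size <= max_db_gb:
                databases[i].append(name)
                used_gb[i] += size
                break
        else:
            # No existing database has room, so plan a new one.
            # A site collection larger than the cap still gets its own database.
            databases.append([name])
            used_gb.append(size)
    return databases


# Hypothetical site collections and their sizes in GB.
site_collections = {
    "/sites/hr": 60,
    "/sites/finance": 45,
    "/sites/projects": 80,
    "/sites/archive": 30,
    "/sites/teams": 15,
}

for number, group in enumerate(assign_site_collections(site_collections, 100), start=1):
    print(f"Content database {number}: {group}")
```

In practice you would also leave headroom for growth and weigh priority or ownership, so treat the output as a starting point for discussion rather than a final layout.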

STEP 6 MINIMIZE PAGE SIZE

The SharePoint user interface is intended to make it easy for information workers to manage content and find resources. For users connected to the portal over the corporate LAN this doesn’t pose much of a problem but for disconnected users on slower WAN links or public facing websites the heavyweight nature of the typical SharePoint page can prove to be a real performance-killer. Fortunately, there are a number of ways to reduce the bloat and trim those pages down to a more manageable size.

First, when designing a new look and feel, it is helpful to start with a minimal master page. The master page contains all the chrome and navigation elements typically found on the top and left of a standard SharePoint page: things like the search box, global navigation, Quick Launch, Site Actions menu and other visual elements. A minimal master page, as the name implies, removes unnecessary elements and allows designers to start with a clean slate that only contains the base functionality required for the page to render correctly.

Second, most SharePoint pages contain links to supporting files, including JavaScript and style sheets, which require additional time to retrieve and execute. Designers can alter the way in which SharePoint pages retrieve these files through a technique called “delayed loading”, which essentially loads the linked files in the background while the rest of the page is rendering, allowing users to view content without waiting for all the back-end processing to take place.

Finally, there are programmatic ways to reduce page sizes in SharePoint which developers can implement relatively easily. One technique is the elimination of whitespace: dead weight which makes the page more readable to developers and designers but which has no impact on the ability of a client browser to render the page. Another method is to remove specific JavaScript and authoring content entirely based upon a user’s permission level. Developers may also wish to disable ViewState persistence for specific page objects and implement custom server controls which leverage caching mechanisms that are more efficient than the default cache objects in SharePoint.
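The whitespace idea is easy to demonstrate outside of SharePoint. The following is a deliberately naive Python sketch that strips the line breaks and indentation sitting between tags and reports how many bytes were saved. It is only an illustration of the principle: it is not how SharePoint renders pages, the function name is my own, and it makes no attempt to protect content inside pre, textarea or script blocks.

```python
import re

# Whitespace between two tags that contains at least one line break.
_BETWEEN_TAGS = re.compile(r">\s*\n\s*<")


def strip_layout_whitespace(markup: str) -> str:
    """Remove indentation and line breaks that sit between tags.

    Spaces between inline elements on the same line are left alone,
    because removing those could change how the text renders.
    """
    return _BETWEEN_TAGS.sub("><", markup.strip())


page = """
<div>
    <ul>
        <li>Home</li>
        <li>Documents</li>
    </ul>
</div>
"""

smaller = strip_layout_whitespace(page)
saved = len(page.encode("utf-8")) - len(smaller.encode("utf-8"))
print(smaller)
print(f"Saved {saved} bytes")
```

On a real page the saving depends entirely on how the markup was authored, so measure before and after rather than assuming a fixed percentage.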

STEP 7 CONFIGURE IIS COMPRESSION

SharePoint content consists of two primary sources: static files resident in the SharePoint root directories (C:\Program Files\Common Files\Microsoft Shared\12 for 2007 and \14 for 2010) and dynamic data stored in the content database (web parts, publishing field controls, navigation controls, etc.). At runtime, SharePoint merges the page contents from both sources prior to transmitting them inside an HTTP response to the requesting user. Internet Information Services versions 6 (Windows Server 2003) and 7 (Windows Server 2008) both contain various mechanisms for reducing the payload of HTTP responses prior to transmitting them across the network. Adjusting these settings can reduce the size of the data transmitted to the client, resulting in shorter load times and faster page rendering. It is important to note that doing so can have an impact on CPU load and memory utilization; care should be taken to ensure that sufficient hardware resources exist before tuning IIS compression settings.

IIS compression settings can be modified from a base value of 0 (no compression) to a maximum value of 10 (full compression). Adjusting this setting determines how aggressive IIS should be in executing the compression algorithms. Setting the value to 4 can result in a static linked file such as core.js being reduced in size from 79k to 60k, resulting in a payload reduction of 24%. Turning it up to 9 will reduce the size even further to 57k (it is generally recommended to avoid using the max setting of 10 as this can place excessive demands on the CPU). Environments which employ extensive custom branding and customization, and those with a large proportion of remote users on slower connections, will benefit the most from adjusting IIS compression levels.
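The percentages quoted above are easy to verify, and the general trade-off between compression level and payload size can be tried out with Python's built-in zlib module. Note the assumptions: zlib levels run from 0 to 9 and are not the same scale as the IIS setting, and the sample payload is an artificial, highly repetitive string, so the sizes printed say nothing about what IIS will achieve on core.js.

```python
import zlib


def payload_reduction(original_kb: float, compressed_kb: float) -> float:
    """Percentage by which the payload shrank."""
    return (original_kb - compressed_kb) / original_kb * 100.0


# Figures quoted in the white paper for core.js.
print(f"79k -> 60k: {payload_reduction(79, 60):.1f}% smaller")
print(f"79k -> 57k: {payload_reduction(79, 57):.1f}% smaller")

# Artificial sample payload (assumption): repetitive script-like text.
sample = b"function add(a, b) { return a + b; }\n" * 2000

# Compare a few zlib levels; higher levels trade CPU time for smaller output.
sizes = {level: len(zlib.compress(sample, level)) for level in (0, 4, 9)}
for level, size in sizes.items():
    print(f"zlib level {level}: {size} bytes (original {len(sample)} bytes)")
```

The useful habit here is the measurement itself: capture the payload size at each candidate level along with CPU load on the web front ends, then pick the lowest level that gives an acceptable saving.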

STEP 8 TAKE ADVANTAGE OF CACHING

SharePoint serves up a lot of data but not all of it is retrieved in real-time from the database. Much of the content requested by users can be cached in memory, including list items, documents, query results and web parts. How, when and by what rules SharePoint caches content is governed by cache profiles which can be configured for a single site or an entire site collection. Each cache profile is comprised of settings which tell the system which content to cache, who to cache it for, when to release it and when to check for updated content.

Site administrators can configure their own cache profiles to meet different user needs. Anonymous users, for example, can be assigned one set of cache policies while authenticated users are assigned another, allowing content editors to get a more recent view of content changes than general readers. Cache profiles can also be configured by page type, so publishing pages and layout pages behave differently, and administrators have the option to specify caching on the server, the client, or both.

In addition, the SharePoint Object Cache can significantly improve the execution time for resource-intensive components (such as the Content Query Web Part). By determining in advance whether or not to check for new query results on each request or to use a set of previous results stored in memory, queries which cross site boundaries can be prevented from making excess demands on the database. Large objects which are requested frequently, such as images and files, can also be cached on disk for each web application to improve page delivery times.
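The idea of reusing a previous set of results instead of checking for new ones on every request can be sketched as a small time-based cache. This is a generic Python illustration of the concept only; the class and parameter names are hypothetical and it is not the SharePoint Object Cache API.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TimedQueryCache:
    """Keep query results in memory and reuse them until they are too old."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable, run_query: Callable[[], Any]) -> Any:
        """Return the cached result for key, running the query only when needed."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # Still fresh: no trip to the database.
        result = run_query()
        self._entries[key] = (now, result)
        return result


# Stand-in for an expensive cross-site query (hypothetical).
def latest_announcements():
    print("querying the database...")
    return ["Announcement 1", "Announcement 2"]


cache = TimedQueryCache(ttl_seconds=60)
cache.get("announcements", latest_announcements)  # runs the query
cache.get("announcements", latest_announcements)  # served from memory
```

The time-to-live is the knob that matters: a longer value spares the database but means readers may see slightly stale results, which is the same trade-off the cache profile settings expose.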

STEP 9 MANAGE PAGE CUSTOMIZATIONS

SharePoint Designer is a useful tool for administrators and power users but it has a side-effect which can be harmful to overall performance: page customizations (or, in legacy parlance, ‘unghosting’). A customized page is one that has been opened in Designer, modified and saved. This has the effect of breaking the coupling between the pages on disk in the SharePoint Root folders and data in the content database. When customization occurs, the entire page content, including the markup and inline code, is stored in the database and must be retrieved each time the page is requested. In most cases, this introduces relatively little additional overhead on a page-by-page basis, but in larger environments with hundreds or even thousands of pages, all that back-and-forth to the database can add up to significant performance degradation.

To prevent this problem, administrators should implement a policy which restricts page customizations to only those situations where it is absolutely necessary. It may be fine, for example, for select contributors to customize user-authored layout pages in the /pages system library, while master and default layout pages are strictly off-limits. Also, it may be necessary to set up a ‘sandbox’ environment where users can create custom web parts which can then be imported into production without having to open controlled pages directly in SharePoint Designer. Site Collection and Farm administrators also have the option of completely disabling the use of Designer or, when necessary, using the “Reset to Site Definition” option to undo changes and revert back to the original content.

STEP 10 LIMIT NAVIGATION DEPTH

One of the most significant design elements on any portal site is the global, drop-down, flyout menu at the top of each page. It seems like a handy way to navigate through all the various sites and pages—until it becomes so deep and cluttered that all ability to navigate beyond the first few levels is lost completely. Even worse, fetching all the data to populate the navigation menus can be resource-intensive on sites with deep hierarchies.

SharePoint designers have the ability to customize the depth and level of each navigation menu by modifying the parameters for the various navigation controls within the master page. When using dynamic navigation, SharePoint builds a site hierarchy in the background, stores it in an XML file and caches this information, helping to reduce the amount of data retrieved from the database. The size of this file, cached or otherwise, can impact the ability of pages to render in a timely manner, because the controls must still be populated with all of the nodes in the hierarchy.

Administrators should limit the depth of navigation to a manageable level which is both usable and constrained to a number of sites that does not impact performance. Furthermore, if the Pages library contains a large number of documents and the Pages option is selected in navigation settings, rendering the current navigation elements can also have a negative impact on performance; either exclude Pages from navigation or limit the number of documents in this system library.
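To see how quickly a depth limit cuts the number of nodes a menu has to render, here is a small Python sketch that trims a navigation tree to a given depth and counts what is left. The tree structure, the site titles and the function names are all hypothetical; the real limits are set on the navigation controls in the master page.

```python
from typing import Any, Dict

Node = Dict[str, Any]  # {"title": str, "children": [Node, ...]}


def limit_depth(node: Node, max_depth: int) -> Node:
    """Return a copy of the tree keeping at most max_depth levels below node."""
    if max_depth <= 0:
        return {"title": node["title"], "children": []}
    return {
        "title": node["title"],
        "children": [limit_depth(child, max_depth - 1) for child in node.get("children", [])],
    }


def count_nodes(node: Node) -> int:
    """Count the node itself plus everything beneath it."""
    return 1 + sum(count_nodes(child) for child in node.get("children", []))


# Hypothetical portal hierarchy.
portal = {
    "title": "Portal",
    "children": [
        {"title": "Departments", "children": [
            {"title": "HR", "children": [
                {"title": "Policies", "children": []},
            ]},
        ]},
        {"title": "Projects", "children": []},
    ],
}

print("Full hierarchy:  ", count_nodes(portal), "nodes")
print("Limited to depth 1:", count_nodes(limit_depth(portal, 1)), "nodes")
```

On a real portal with hundreds of subsites the difference is far more dramatic, which is why a shallow global menu combined with good search usually serves users better than a deep flyout.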

CONCLUSION

SharePoint offers a tremendous array of options for improving performance and efficiency of collaborative portal applications. Understanding the inner workings of the framework and its dependencies upon key infrastructure components, including servers, network connections, services, operating system components and databases, is vital to the success of any optimization strategy. Implementing basic performance enhancements as outlined in this paper can provide tremendous improvements to the operation and performance of SharePoint within the enterprise.