Thursday, June 29, 2017

Remove images and internal links from HTML content via regular expressions in PowerShell

Suppose that we need to export intranet news (e.g. to the file system) so that another tool can read them and show them to external users or on the internet. Of course, images and internal links won’t work in this case because the intranet is not accessible from outside the organization’s network. One solution is to remove them from the exported news. Here are 2 functions which remove images and internal links from HTML content using regular expressions:

   1: function RemoveImages($str)
   2: {
   3:     if ([System.String]::IsNullOrEmpty($str))
   4:     {
   5:         return "";
   6:     }
   7:     $re = [regex]"<img.+?/>"
   8:     return $re.Replace($str, "[image deleted]")
   9: }
  10:  
  11: function RemoveInternalLinks($str)
  12: {
  13:     if ([System.String]::IsNullOrEmpty($str))
  14:     {
  15:         return "";
  16:     }
  17:     
  18:     $matchEvaluator =
  19:     {
  20:         param($m)
  21:         
  22:         if ($m.Groups.Count -eq 2 -and $m.Groups[1].Success -and
  23:             ($m.Groups[1].Value.ToLower().Contains("myintranet.com") -or
  24:                 $m.Groups[1].Value.StartsWith("/")))
  25:         {
  26:             return "[link deleted]";
  27:         }
  28:         return $m.Groups[0].Value;
  29:     }
  30:     
  31:     $re = [regex]"<a.+?href=['""](.+?)['""].*?>.+?</a>"
  32:     return $re.Replace($str, $matchEvaluator)
  33: }

If we apply these functions to the following HTML:

Some text <img src="http://example.com/someimage.png" />, internal links <a href="http://myintranet.com">test1</a> and <a href="/subsite">test2</a>, external link <a href="http://example.com">test3</a>

it will be transformed to the following:

Some text [image deleted], internal links [link deleted] and [link deleted], external link <a href="http://example.com">test3</a>

Note that both absolute and relative internal links are removed. This is done by a conditional regular expression replace (lines 18-32), which removes a link only if its href attribute contains the intranet server name (myintranet.com in our example) or starts with a slash /, which means a server-relative link. The external link remains in the resulting HTML. I hope this information will help someone.
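If the exported news is post-processed in JavaScript rather than PowerShell, the same conditional replace can be sketched with String.prototype.replace and a callback. This is only an illustrative port under the same assumptions as above: the function names and the INTRANET_HOST constant are hypothetical, and myintranet.com is the example host from this post.

```javascript
// Assumption: "myintranet.com" is the intranet host, as in the example above.
var INTRANET_HOST = "myintranet.com";

// Replaces self-closing <img ... /> tags with a placeholder.
function removeImages(html) {
    if (!html) {
        return "";
    }
    return html.replace(/<img.+?\/>/g, "[image deleted]");
}

// Replaces links whose href points to the intranet host or starts with "/".
function removeInternalLinks(html) {
    if (!html) {
        return "";
    }
    var re = /<a.+?href=['"](.+?)['"].*?>.+?<\/a>/g;
    // The callback receives the whole match and the captured href value;
    // whatever it returns is used as the replacement for that match.
    return html.replace(re, function (match, href) {
        var isInternal =
            href.toLowerCase().indexOf(INTRANET_HOST) !== -1 ||
            href.charAt(0) === "/";
        return isInternal ? "[link deleted]" : match;
    });
}

var sample = 'Some text <img src="http://example.com/someimage.png" />, ' +
    'internal links <a href="http://myintranet.com">test1</a> and ' +
    '<a href="/subsite">test2</a>, external link ' +
    '<a href="http://example.com">test3</a>';

console.log(removeInternalLinks(removeImages(sample)));
```

As in the PowerShell version, the decision to keep or drop a link is made per match inside the callback, so external links pass through unchanged.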

Wednesday, June 21, 2017

Get folder of list item using JavaScript object model in SharePoint

Suppose that we need to get the folder (SP.Folder) where a specific list item (SP.ListItem) is located. The following code shows how to do that:

   1: var ctx = SP.ClientContext.get_current();
   2: var file = item.get_file();
   3: ctx.load(file);
   4: ctx.executeQueryAsync(
   5:     Function.createDelegate(this, function (sender, args) {                                
   6:         var folderUrl = file.get_serverRelativeUrl().substring(0,
   7:             file.get_serverRelativeUrl().lastIndexOf("/"));
   8:         var folder = ctx.get_web().getFolderByServerRelativeUrl(folderUrl);
   9:         ctx.load(folder);
  10:         ctx.executeQueryAsync(
  11:             Function.createDelegate(this, function (sender, args) {
  12:                 ...
  13:             }),
  14:             Function.createDelegate(this, function (sender, args) {
  15:                 console.log(args.get_message());
  16:             }));
  17:     }),
  18:     Function.createDelegate(this, function (sender, args) {
  19:         console.log(args.get_message());
  20:     }));

Here we first load the file (SP.File) (lines 2-3), then get its server-relative URL (lines 6-7), and use this URL to get the folder (lines 8-9). The JSOM documentation says that there is an SP.ListItem.folder property, but it always returned an error for some reason. Maybe it will be fixed in future updates.
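The string manipulation on lines 6-7 can be isolated into a small function, which makes it easier to check without a SharePoint context. The sketch below is only an illustration: getParentFolderUrl is a hypothetical helper name (not part of JSOM), and returning "/" for a file directly under the root is an assumption of this example.

```javascript
// Hypothetical helper (not part of JSOM): returns the parent folder URL
// for a server-relative file URL such as "/sites/news/Docs/report.docx".
function getParentFolderUrl(fileServerRelativeUrl) {
    var lastSlash = fileServerRelativeUrl.lastIndexOf("/");
    // Assumption: a URL without a slash, or a file directly under the
    // root, is mapped to the root folder "/".
    if (lastSlash <= 0) {
        return "/";
    }
    return fileServerRelativeUrl.substring(0, lastSlash);
}

console.log(getParentFolderUrl("/sites/news/Docs/report.docx")); // prints "/sites/news/Docs"
```

The result can then be passed to getFolderByServerRelativeUrl in the same way as folderUrl in the listing.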

Monday, June 19, 2017

Problem with SharePoint NTLM authentication and nginx proxy

Some time ago we faced the following problem: an on-premise SharePoint 2013 site had 2 authentication zones, Default and Custom. The Default zone used NTLM authentication, while the Custom zone used FBA. Each zone had its own host header (e.g. windows.example.com for the Default zone and fba.example.com for the Custom zone). Both host headers were specified in the Alternate access mappings of the corresponding web application in SharePoint Central Administration.

To access the SharePoint site remotely, nginx was used as a reverse proxy between the client and the internal SharePoint farm. With this configuration the FBA URL worked both from within the SharePoint farm (from an RDP session) and remotely, while the Windows URL worked only from the SharePoint server, and only if we bypassed nginx by adding windows.example.com to the hosts file and pointing it to 127.0.0.1 (the server’s own address). All attempts to log in through nginx failed with 401 Unauthorized (I mean logins using the custom host header; logins from an RDP session via the server’s name worked).

Investigation showed that nginx doesn’t work well with NTLM authentication (see e.g. How to enable windows authentication through a reverse proxy), so in the end we removed nginx from the path and configured access to SharePoint via IP table. If you have a solution which works with nginx, please share it. Anyway, I hope this information will be helpful.

Monday, June 5, 2017

Perform CAML queries with managed metadata fields to SharePoint lists via JavaScript object model

In this post I will show how to perform CAML queries which contain conditions on managed metadata (taxonomy) fields via the JavaScript object model (JSOM). Suppose that we have a custom list with 2 fields:

  1. Location – managed metadata
  2. Path – single line text which contains the URL for a specific location

We need to query this list by location value and get the path for this specific location. Here is the JavaScript code which can be used for that:

   1: SP.SOD.executeFunc("sp.js", "SP.ClientContext", function () {
   2:     try {
   3:         if (typeof (location) == "undefined" || location == null ||
   4:             location == "") {
   5:             return;
   6:         }
   7:  
   8:         var ctx = new SP.ClientContext("http://example.com");
   9:         var list = ctx.get_web().get_lists().getByTitle("Locations");
  10:         var query = new SP.CamlQuery();
  11:         query.set_viewXml(
  12:             '<View>' +
  13:                 '<Query>' +
  14:                     '<Where>' +
  15:                         '<Contains>' +
  16:                           '<FieldRef Name=\'Location\'/>' +
  17:                           '<Value Type=\'Text\'>' + location + '</Value>' +
  18:                         '</Contains>' +
  19:                     '</Where>' +
  20:                 '</Query>' +
  21:                 '<RowLimit>1</RowLimit>' +
  22:             '</View>');
  23:  
  24:         var items = list.getItems(query);
  25:         ctx.load(items, 'Include(Path)');
  26:         ctx.executeQueryAsync(
  27:             Function.createDelegate(this, function (sender, args) {
  28:                 var path = "";
  29:                 var enumerator = items.getEnumerator();
  30:                 while (enumerator.moveNext()) {
  31:                     var item = enumerator.get_current();
  32:                     path = item.get_item("Path").get_url();
  33:                     break;
  34:                 }
  35:  
  36:                 console.log("Path: " + path);
  37:             }),
  38:             Function.createDelegate(this, function (sender, args) {
  39:                 console.log(args.get_message());
  40:             }));
  41:  
  42:     } catch (ex) {
  43:         console.log(ex.message);
  44:     }
  45: });

The code is self-explanatory, so I won’t explain it in detail. The only thing to note is that, in order to get a list item by taxonomy value in the Location field, we use the Contains operator and pass the term label to the query (lines 12-22). After that we just iterate through the returned items (in this example we set RowLimit to 1, but in your scenarios you can of course get many items) and read the Path field value. To be able to access the Path field, we included it in the result set (line 25).
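One detail worth considering: the term label is concatenated into the view XML as-is (line 17), so a label containing characters such as & or < would produce invalid CAML. A minimal sketch of building the same query with the label escaped first is shown below. Both escapeXml and buildLocationViewXml are hypothetical helper names introduced for this example, not JSOM functions.

```javascript
// Hypothetical helper: escapes characters that are special in XML so that
// a term label such as "R&D" does not break the CAML markup.
function escapeXml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&apos;");
}

// Hypothetical helper: builds the same view XML as in the listing above,
// with the term label escaped and the row limit passed as a parameter.
function buildLocationViewXml(termLabel, rowLimit) {
    return "<View>" +
        "<Query><Where><Contains>" +
        "<FieldRef Name='Location'/>" +
        "<Value Type='Text'>" + escapeXml(termLabel) + "</Value>" +
        "</Contains></Where></Query>" +
        "<RowLimit>" + rowLimit + "</RowLimit>" +
        "</View>";
}

console.log(buildLocationViewXml("R&D", 1));
```

The returned string can be passed to query.set_viewXml in place of the inline concatenation.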