- Research
- Open Access
Sequence spaces \(M(\phi)\) and \(N(\phi)\) with application in clustering
- Mohd Shoaib Khan^{1},
- Badriah AS Alamri^{2},
- M Mursaleen^{2, 3} and
- QM Danish Lohani^{1}
https://doi.org/10.1186/s13660-017-1333-z
© The Author(s) 2017
- Received: 14 November 2016
- Accepted: 14 February 2017
- Published: 21 March 2017
Abstract
Distance measures play a central role in the development of clustering techniques. Owing to their rich mathematical background and natural implementation, \(l_{p}\) distance measures are used in almost every clustering process. Besides the \(l_{p}\) distance measures, several other distance measures exist. Sargent introduced special distance measures \(m(\phi)\) and \(n(\phi)\), which are closely related to \(l_{p}\). In this paper, we generalize the Sargent sequence spaces by introducing the sequence spaces \(M(\phi)\) and \(N(\phi)\). Moreover, it is shown that both spaces are BK-spaces and that one is the dual of the other. Further, we cluster the two-moon dataset by using the \(M(\phi)\)-distance measure (induced by the sequence space \(M(\phi)\)) in the k-means clustering algorithm. The clustering result shows the efficacy of replacing the Euclidean distance measure by the \(M(\phi)\)-distance measure in the k-means algorithm.
Keywords
- clustering
- double sequence
- k-means clustering
- two-moon dataset
MSC
- 40H05
- 46A45
1 Introduction
Clustering is a well-known procedure for dealing with unsupervised learning problems in pattern recognition. It is the process of organizing data into groups, called clusters, so that objects in the same cluster are similar to one another but dissimilar to objects in other clusters [1]. The main contributions to the field of cluster analysis were the pioneering works of MacQueen [1] and Bezdek [2], who introduced highly significant clustering algorithms such as k-means [1] and fuzzy c-means [2]. Among all clustering algorithms, k-means is the simplest unsupervised clustering algorithm; it makes use of the minimum distance from the cluster center and has many applications in scientific and industrial research [3–6] (for more information about the k-means clustering algorithm, see Section 4). The k-means algorithm is distance dependent, so its output varies with the distance measure. Clustering is usually carried out with the Euclidean distance measure [7], but this often fails to give good results. In this paper, we define the \(M(\phi)\)- and \(N(\phi)\)-distance measures. Further, the \(M(\phi )\)-distance is used to cluster the two-moon dataset. The result is compared with that of the Euclidean distance measure to show the efficacy of the \(M(\phi)\)-distance over the Euclidean distance measure. The \(M(\phi )\)- and \(N(\phi)\)-distance measures generalize the \(m(\phi)\)- and \(n(\phi)\)-distance measures introduced by Sargent [8] and further studied by Mursaleen [9, 10] (for more about \(m(\phi )\) and \(n(\phi)\), refer to [8–10]). The \(M(\phi)\) and \(N(\phi)\) spaces are closely related to the \(l_{p}\) distance measures. The \(l_{p}\) measures and their variants are widely used to solve problems arising in the fields of market prediction [11], machine learning [12], pattern recognition [13], clustering [20], etc.
Throughout the paper, by ω we denote the set of all real or complex sequences. Moreover, by \(l_{\infty}\), c and \(c_{0}\) we denote the Banach spaces of bounded, convergent and null sequences, respectively, and by \(l_{p}\) the Banach space of absolutely p-summable sequences with p-norm \({\Vert \cdot \Vert } _{p}\). For the following notions, we refer to [14, 15]. A double sequence \(x = (x_{jk})\) of real or complex numbers is said to be bounded if \(\Vert x \Vert _{\infty} = \sup_{j,k}\vert x_{jk}\vert < \infty\); the space of all bounded double sequences is denoted by \(\mathcal{L}_{\infty}\). A double sequence \(x = (x_{jk})\) is said to converge to the limit L in Pringsheim’s sense (shortly, convergent to L) if for every \(\varepsilon> 0\), there exists an integer N such that \(\vert x_{jk} - L\vert < \varepsilon\) whenever \(j,k > N\). In this case L is called the p-limit of x. If in addition \(x \in\mathcal{L}_{\infty}\), then x is said to be boundedly convergent to L in Pringsheim’s sense (shortly, bp-convergent to L). A double sequence \(x = (x_{jk})\) is said to converge regularly to L (shortly, r-convergent to L) if x is p-convergent and the limits \(x_{j}: = \lim_{k}x_{jk}\) (\(j \in\mathbb {N} \)) and \(x^{k}: = \lim_{j}x_{jk}\) (\(k \in\mathbb{N} \)) exist. Note that in this case the limits \(\lim_{j}\lim_{k}x_{jk}\) and \(\lim_{k}\lim_{j}x_{jk}\) exist and are equal to the p-limit of x. In general, for any notion of convergence ν, the space of all ν-convergent double sequences will be denoted by \(\mathcal{C}_{\nu}\) and the limit of a ν-convergent double sequence x by \(\nu\textrm{-} \lim_{j,k}x_{jk}\), where \(\nu\in\{ p,\mathit{bp},r\}\).
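To see how these notions of convergence differ, consider the following small example (added here for illustration; it is not taken from [14, 15]). Define the double sequence
$$x_{jk} = \begin{cases} j, & k = 1, \\ 0, & k \ge 2. \end{cases} $$
Then \(x_{jk} = 0\) whenever \(j,k > 1\), so x is p-convergent to 0. However, \(\sup_{j,k}\vert x_{jk}\vert = \infty\), so x is not bp-convergent, and \(\lim_{j}x_{j1}\) does not exist, so x is not r-convergent either.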
Remark 1.1
- (i)
The spaces \(m(\phi)\) and \(n(\phi)\) are BK-spaces with their usual norms.
- (ii)
If \(\phi_{n} = 1\) (\(n = 1,2,3,\ldots\)), then \(m(\phi) = l_{1}\) [\(n(\phi) = l_{\infty} \)], and if \(\phi_{n} = n\) (\(n = 1,2,3,\ldots\)), then \(m(\phi) = l_{\infty}\) [\(n(\phi) = l_{1} \)].
- (iii)
\(l_{1} \subseteq m(\phi) \subseteq l_{\infty} \) [\(l_{1} \subseteq n(\phi) \subseteq l_{\infty} \)] for all \(\phi\in\Phi\).
- (iv)
For any \(\phi\in\Phi\), \(m(\phi)\neq l_{p}\) [\(n(\phi)\neq l_{q} \)], \(1 < p < \infty\).
Remark 1.2
If \(\phi_{st} = 1\) (\(s,t = 1,2,3,\ldots\)), then \(M(\phi) = L_{1}\) [\(N(\phi) = L_{\infty} \)], and if \(\phi_{st} = st\) (\(s,t = 1,2,3,\ldots\)), then \(M(\phi) = L_{\infty}\) [\(N(\phi) = L_{1}\)].
We now state the following known results of [18] for single sequences (series) which can also be proved easily for double sequences (series).
Lemma 1.1
If the series \(\sum u_{n}x_{n}\) is convergent for every x of a BK-space E, then the functional \(\sum_{n = 1}^{\infty} u_{n}x_{n}\) is linear and continuous in E.
Lemma 1.2
2 Properties of the spaces \(M(\phi)\) and \(N(\phi)\)
Theorem 2.1
Proof
Remark 2.1
Lemma 2.1
- (i)
If \(x \in M(\phi)\) [\(x \in N(\phi) \)] and \(u \in S(x)\), then \(u \in M(\phi)\) [\(u \in N(\phi) \)] and \(\Vert u\Vert = \Vert x\Vert \).
- (ii)
If \(x \in M(\phi)\) [\(x \in N(\phi)\)] and \(\vert u_{mn}\vert \le \vert x_{mn}\vert \) for every positive integer m, n, then \(u \in M(\phi)\) [\(u \in N(\phi)\)] and \(\Vert u\Vert \le \Vert x\Vert \).
Proof
(ii) This follows easily from the definition. □
Theorem 2.2
For arbitrary \(\phi\in\Theta\), we have \(\Delta_{11}\phi\in M(\phi)\) and \(\Vert \Delta_{11}\phi \Vert _{M(\phi)} \le2\).
Proof
Lemma 2.2
Proof
Theorem 2.3
Proof
Theorem 2.4
In order that \(\sum u_{mn}x_{mn}\) be convergent [absolutely convergent] whenever \(x \in N(\phi)\), it is necessary and sufficient that \(u \in M(\phi)\).
Proof
3 Inclusion relations for \(M(\phi)\) and \(N(\phi)\)
Lemma 3.1
Proof
Theorem 3.1
- (i)
\(L_{1} \subseteq M(\phi) \subseteq L_{\infty}\) [\(L_{1} \subseteq N(\phi) \subseteq L_{\infty} \)] for all ϕ of Θ.
- (ii)
\(M(\phi) = L_{1}\) [\(N(\phi) = L_{\infty} \)] if and only if \(\mathit{bp}\textit{-}\lim_{s,t}\phi_{st} < \infty\).
- (iii)
\(M(\phi) = L_{\infty}\) [\(N(\phi) = L_{1}\)] if and only if \(\mathit{bp}\textit{-}\lim_{s,t}(\phi_{st}/st) > 0\).
Proof
We prove the first version; the second version then follows from Theorems 2.3 and 2.4. Since \(\phi_{11} \le\phi_{mn} \le mn\phi_{11}\) for all ϕ of Θ, it follows from Lemma 3.1 that (i) is satisfied. Further, from Lemma 3.1, it follows that \(M(\phi) \subseteq L_{1}\) if and only if \(\sup_{s,t \ge 1}\phi_{st} < \infty\), while \(L_{\infty} \subseteq M(\phi)\) if and only if \(\sup_{s,t \ge1}(st/\phi_{st}) < \infty\); since the sequences \(\{ \phi_{st}\}\) and \(\{ st/\phi_{st}\}\) are monotonic, (ii) and (iii) are also satisfied. □
Theorem 3.2
- (i)
Given any ϕ of Θ, \(M(\phi)\neq L_{p}\) [\(N(\phi )\neq L_{q}\)].
- (ii)
In order that \(L_{p} \subset M(\phi)\) [\(N(\phi) \subset L_{q}\)], it is necessary and sufficient that \(\sup_{s,t \ge1} ( \frac{(st)^{1/q}}{\phi_{st}} ) < \infty\).
- (iii)
In order that \(M(\phi) \subset L_{p}\) [\(N(\phi) \supset L_{q}\)], it is necessary and sufficient that \(\Delta\phi\in L_{p}\).
- (iv)
\(\bigcup_{\Delta\phi\in L_{p}} M(\phi) = L_{p}\) [\(\bigcap_{\Delta\phi\in L_{p}} N(\phi) = L_{q} \)].
Proof
(i) Let us suppose that \(M(\phi) = L_{p}\).
(iii) By Theorem 2.2, we have \(\Delta\phi\in M(\phi)\), so the condition is necessary. For sufficiency, we suppose that \(\Delta\phi\in L_{p}\) and that \(x \in M(\phi)\). Then \(\{ u_{mn}\Delta_{11}\phi_{mn}\} \in L_{1}\) whenever \(u \in L_{q}\), and it therefore follows from Lemma 2.2 that \(\{ u_{mn}x_{mn}\} \in L_{1}\) whenever \(u \in L_{q}\). Since \(L_{p}\) is the dual of \(L_{q}\), this gives \(x \in L_{p}\), and since \(M(\phi)\neq L_{p}\) by (i), it follows that \(M(\phi) \subset L_{p}\).
(iv) By (iii), we have \(\bigcup_{\Delta\phi\in L_{p}} M(\phi) \subseteq L_{p}\). To obtain the complementary relation \(L_{p} \subseteq\bigcup_{\Delta\phi\in L_{p}} M(\phi)\), let us suppose that \(x \in L_{p}\). Then \(\lim_{m,n \to\infty} x_{mn} = 0\), and hence there is an element u of \(S(x)\) such that \(\{ \vert u_{mn}\vert \}\) is a non-increasing sequence. If we take \(\psi= \{ \sum_{i,j = 1,1}^{m,n} \vert u_{ij}\vert \}\), then it is easy to verify that \(\psi\in \Theta\) and that \(x \in M(\psi)\). Since \(\Delta\psi\in L_{p}\), the complementary relation is satisfied. □
4 Application of \(M(\phi)\) and \(N(\phi)\) in clustering
In this section, we implement the k-means clustering algorithm with the \(M(\phi)\)-distance measure and apply it to cluster the two-moon data. The clustering result obtained with the \(M(\phi)\)-distance measure is compared with the result obtained with the existing Euclidean distance measure (\(l_{2}\)).
4.1 Algorithm to compute \(M(\phi)\) distance
- (1)
Calculate \(a_{i} = \frac{1}{\phi_{1,i}}\vert x_{i} - y_{i}\vert \), \(i = 1,2,3, \ldots,n\).
- (2)
The \(M(\phi)\)-distance between x and y is d, where
$$d = \max\{ a_{1},a_{1} + a_{2}, \ldots,a_{1} + a_{2} + \cdots+ a_{n}\}. $$
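The two steps above can be sketched in Python as follows. This is only a minimal illustration, not the authors' implementation: the function name `m_phi_distance` and the list `phi`, which is assumed to hold the positive weights \(\phi_{1,1},\ldots,\phi_{1,n}\), are introduced here for the example. Because every \(a_{i}\) is non-negative, the partial sums are non-decreasing, so the maximum in step (2) is attained at the full sum.

```python
from itertools import accumulate


def m_phi_distance(x, y, phi):
    """Distance between points x and y following steps (1)-(2).

    phi[i] stands for the weight phi_{1,i+1}; it is assumed to be positive.
    (Hypothetical helper name and argument layout, chosen for illustration.)
    """
    if not (len(x) == len(y) == len(phi)):
        raise ValueError("x, y and phi must have the same length")
    # Step (1): weighted coordinate-wise absolute differences a_i.
    a = [abs(xi - yi) / p for xi, yi, p in zip(x, y, phi)]
    # Step (2): maximum over the partial sums a_1, a_1 + a_2, ...
    return max(accumulate(a), default=0.0)


# With phi_{1,i} = i the second coordinate is down-weighted by 2.
print(m_phi_distance([0, 0], [3, 4], [1, 2]))  # 5.0
# With phi_{1,i} = 1 the value coincides with the l_1 distance.
print(m_phi_distance([0, 0], [3, 4], [1, 1]))  # 7.0
```

The second call is in line with Remark 1.2, where the choice \(\phi_{st} = 1\) gives \(M(\phi) = L_{1}\).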
4.2 K-means clustering algorithm for \(M(\phi)\)-distance measure
- (1)
Randomly or judiciously select k cluster centers (in this paper, we choose the first k data points as the cluster centers, \(y = [x_{1},x_{2},\ldots,x_{k}]\)).
- (2)
Using the \(M(\phi)\) or \(N(\phi)\) distance measure (since the two spaces are dual to each other, from the application point of view we consider only \(M(\phi)\)), compute the distance between each data point and each cluster center.
- (3)
Assign each data point to the cluster whose center is at minimum \(M(\phi)\)-distance from it.
- (4)
Compute the centers of the new clusters formed in steps (1)-(3) as follows: \(c_{i} = \frac{1}{k_{i}}\sum_{j = 1}^{k_{i}} x_{j}\), where the sum runs over the points \(x_{j}\) assigned to the ith cluster and \(k_{i}\) denotes the number of points in the ith cluster.
- (5)
Repeat steps (2)-(4) until the difference between two consecutive sets of cluster centers falls below a small number ε.
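Steps (1)-(5) can be sketched in Python as shown below. This is a hedged illustration under stated assumptions, not the authors' code: the names `m_phi_distance` and `kmeans_m_phi`, the weight list `phi` (standing for \(\phi_{1,1},\ldots,\phi_{1,n}\)), the iteration cap `max_iter`, and the choice to measure the movement of the centers with the same \(M(\phi)\)-distance are all assumptions made for this example.

```python
from itertools import accumulate


def m_phi_distance(x, y, phi):
    """M(phi)-distance of Section 4.1 (hypothetical helper name)."""
    a = [abs(xi - yi) / p for xi, yi, p in zip(x, y, phi)]
    return max(accumulate(a), default=0.0)


def kmeans_m_phi(points, k, phi, eps=1e-6, max_iter=100):
    """k-means with the M(phi)-distance; returns (centers, labels)."""
    # Step (1): take the first k data points as the initial centers.
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(max_iter):  # max_iter is a safety cap (assumption)
        # Steps (2)-(3): assign each point to its nearest center.
        labels = [
            min(range(k), key=lambda i: m_phi_distance(p, centers[i], phi))
            for p in points
        ]
        # Step (4): recompute each center as the mean of its members.
        new_centers = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centers.append(
                    [sum(coord) / len(members) for coord in zip(*members)]
                )
            else:
                # An empty cluster keeps its previous center (assumption).
                new_centers.append(centers[i])
        # Step (5): stop once no center has moved by eps or more.
        shift = max(
            m_phi_distance(old, new, phi)
            for old, new in zip(centers, new_centers)
        )
        centers = new_centers
        if shift < eps:
            break
    return centers, labels


# Toy data: two well-separated groups in the plane, weights phi_{1,i} = i.
data = [(0, 0), (10, 10), (0, 1), (10, 11)]
centers, labels = kmeans_m_phi(data, k=2, phi=[1, 2])
print(labels)   # [0, 1, 0, 1]
print(centers)  # [[0.0, 0.5], [10.0, 10.5]]
```

The update in step (4) is the ordinary arithmetic mean, as in the formula above; only the assignment step depends on the chosen distance measure.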
4.3 Two-moon dataset clustering by using \(M(\phi)\)-distance measure in k-means algorithm
5 Conclusions
In this paper, we defined the Banach spaces \(M(\phi)\) and \(N(\phi)\) and discussed their mathematical properties. Further, we proved some of their inclusion relations. Furthermore, we applied the distance measure induced by the Banach space \(M(\phi)\) to cluster the two-moon data with the k-means clustering algorithm; the result of the experiment shows that the \(M(\phi)\)-distance measure improves the clustering accuracy compared with the Euclidean distance measure.
Declarations
Acknowledgements
The second and third authors gratefully acknowledge the financial support from King Abdulaziz University, Jeddah, Saudi Arabia.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Authors’ Affiliations
References
- MacQueen, J, et al.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281-297 (1967)
- Bezdek, JC: A review of probabilistic, fuzzy, and neural models for pattern recognition. J. Intell. Fuzzy Syst. 1(1), 1-25 (1993)
- Jain, AK: Data clustering: 50 years beyond k-means. Pattern Recognit. Lett. 31(8), 651-666 (2010)
- Cheng, M-Y, Huang, K-Y, Chen, H-M: K-means particle swarm optimization with embedded chaotic search for solving multidimensional problems. Appl. Math. Comput. 219(6), 3091-3099 (2012)
- Yao, H, Duan, Q, Li, D, Wang, J: An improved k-means clustering algorithm for fish image segmentation. Math. Comput. Model. 58(3), 790-798 (2013)
- Capó, M, Pérez, A, Lozano, JA: An efficient approximation to the k-means clustering for massive data. Knowl.-Based Syst. 117, 56-69 (2016)
- Güngör, Z, Ünler, A: K-harmonic means data clustering with simulated annealing heuristic. Appl. Math. Comput. 184(2), 199-209 (2007)
- Sargent, W: Some sequence spaces related to the \(\ell_{p}\) spaces. J. Lond. Math. Soc. 1(2), 161-171 (1960)
- Mursaleen, M: Some geometric properties of a sequence space related to \(\ell_{p}\). Bull. Aust. Math. Soc. 67(2), 343-347 (2003)
- Mursaleen, M: Application of measure of noncompactness to infinite system of differential equations. Can. Math. Bull. 56, 388-394 (2013)
- Chen, L, Ng, R: On the marriage of \(\ell_{p}\)-norms and edit distance. In: Proceedings of the Thirtieth International Conference on Very Large Data Bases, vol. 30, pp. 792-803 (2004)
- Cristianini, N, Shawe-Taylor, J: An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, Cambridge (2000)
- Xu, Z, Chen, J, Wu, J: Clustering algorithm for intuitionistic fuzzy sets. Inf. Sci. 178(19), 3775-3790 (2008)
- Pringsheim, A: Zur theorie der zweifach unendlichen zahlenfolgen. Math. Ann. 53(3), 289-321 (1900)
- Mursaleen, M, Mohiuddine, SA: Convergence Methods for Double Sequences and Applications. Springer, Berlin (2014)
- Başar, F, Şever, Y: The space \(\mathcal{L}_{q}\) of double sequences. Math. J. Okayama Univ. 51, 149-157 (2009)
- Altay, B, Başar, F: Some new spaces of double sequences. J. Math. Anal. Appl. 309(1), 70-90 (2005)
- Wilansky, A: Summability Through Functional Analysis. North Holland Math. Stud, vol. 85 (1984)
- Jain, AK, Law, MHC: Data clustering: a users dilemma. In: Proceedings of the First International Conference on Pattern Recognition and Machine Intelligence (2005)
- Khan, MS, Lohani, QMD: A similarity measure for atanassov intuitionistic fuzzy sets and its application to clustering. In: Computational Intelligence (IWCI), International Workshop on. IEEE, Dhaka, Bangladesh (2016)